Multi-label Connectionist Temporal Classification
Curtis Wigington, Brian L. Price, Scott D. Cohen
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01
DOI: 10.1109/ICDAR.2019.00161 (https://doi.org/10.1109/ICDAR.2019.00161)
The Connectionist Temporal Classification (CTC) loss function [1] enables end-to-end training of a neural network for sequence-to-sequence tasks without requiring a prior alignment between the input and the output. CTC is traditionally used to train sequential, single-label problems, in which each element of the sequence has exactly one class. In this work, we show that CTC is not suitable for multi-label tasks, and we present a novel Multi-label Connectionist Temporal Classification (MCTC) loss function for multi-label, sequence-to-sequence classification. Multi-label classes can represent meaningful attributes of a single element; for example, in Optical Music Recognition (OMR), a music note can have separate duration and pitch attributes. Our approach achieves state-of-the-art results on Joint Handwritten Text Recognition and Named Entity Recognition, Asian Character Recognition, and OMR.
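To make the single-label setting concrete, the following is a minimal sketch of the standard CTC likelihood that the abstract takes as its starting point. It is not the MCTC loss and is not taken from the paper. It computes the probability of a target sequence by summing over every frame-level alignment that collapses to it, once with the usual forward recursion and once by brute-force enumeration as a cross-check. The blank index, the function names, and the toy probabilities are all assumptions made for illustration.

```python
import itertools
import math

BLANK = 0  # index of the CTC blank symbol (an assumption of this sketch)


def collapse(path):
    """Apply the CTC collapsing rule: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out


def ctc_prob_forward(probs, target):
    """P(target | probs) via the CTC forward (alpha) recursion.

    probs[t][k] is the per-frame probability of class k at time step t.
    Each frame carries exactly one class, which is the single-label setting.
    """
    # Extended target with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [BLANK]
    for y in target:
        ext += [y, BLANK]
    T, S = len(probs), len(ext)
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # The blank between two labels may be skipped only if they differ.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new
    # Valid alignments end on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


def ctc_prob_brute_force(probs, target):
    """Same quantity, obtained by enumerating every possible alignment."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in itertools.product(range(K), repeat=T):
        if collapse(path) == list(target):
            total += math.prod(probs[t][k] for t, k in enumerate(path))
    return total


# Toy example: 3 time steps, classes {blank, 1, 2}; the values are made up.
probs = [
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
target = [1, 2]
likelihood = ctc_prob_forward(probs, target)
loss = -math.log(likelihood)  # the CTC loss is the negative log-likelihood
print(f"P(target) = {likelihood:.4f}, CTC loss = {loss:.4f}")
```

Each time step here emits a single class index, so a note's duration and pitch, for instance, could not be predicted as two separate attributes of one element without changing how labels are encoded. That limitation is the gap the abstract says MCTC addresses; the sketch does not attempt to reproduce the paper's formulation.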