Target-Directed MixUp for Labeling Tangut Characters

Guangwei Zhang, Yinliang Zhao

2019 International Conference on Document Analysis and Recognition (ICDAR), September 2019

DOI: 10.1109/ICDAR.2019.00041
Citations: 0
Abstract
Deep learning has greatly improved performance on computer vision and image-understanding tasks, but it depends on large training datasets of labeled images. Labeling data, however, is usually expensive and time-consuming, even though unlabeled data are easy to obtain. Given a limited budget or newly emerging categories, it is practical to build the training dataset iteratively from a small set of manually labeled data. The labeled data can be used not only to train the model but also as a source of knowledge for finding examples of classes not yet included in the training dataset. Mixup [1] improves a model's accuracy and generalization by augmenting the training dataset with "virtual examples" generated by mixing randomly selected pairs of training examples. Motivated by Mixup, we propose the Target-Directed Mixup (TDM) method for building the training dataset of a deep learning-based Tangut character recognition system. The virtual examples are generated by mixing two or more similar examples from the training dataset, together with target examples of unseen classes that need to be labeled, which is a kind of generative few-shot learning. This method helps expand the training dataset by finding real examples of unseen Tangut characters, and it provides virtual examples that can represent rare characters that occur only infrequently in historical documents. In our experiments, TDM helps recognize unseen examples with an accuracy of 80% given only 4 to 5 real target examples, greatly reducing the human labor required for data annotation.
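The mixing operation the abstract describes can be sketched as follows. The first function is standard Mixup as in [1]: a convex combination of two examples and their one-hot labels with a Beta-distributed coefficient. The second is a minimal sketch, under our own assumptions, of the "two or more similar examples" combination mentioned for TDM; the function names, the uniform default weights, and the `alpha=0.2` value are illustrative choices, not taken from the paper.

```python
import numpy as np

def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Standard Mixup: x~ = lam*x1 + (1-lam)*x2 and the same
    convex combination of the one-hot labels, lam ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_virtual = lam * x1 + (1.0 - lam) * x2
    y_virtual = lam * y1 + (1.0 - lam) * y2
    return x_virtual, y_virtual

def mix_many(examples, weights=None):
    """Hypothetical multi-example mix (our reading of TDM's
    "two or more similar examples"): a convex combination of
    n examples; weights default to uniform and must sum to 1."""
    examples = np.asarray(examples, dtype=float)
    if weights is None:
        weights = np.full(len(examples), 1.0 / len(examples))
    weights = np.asarray(weights, dtype=float)
    # Weighted sum over the first (example) axis.
    return np.tensordot(weights, examples, axes=1)
```

Because the combination is convex, the virtual image stays inside the span of its source images and the virtual label remains a valid probability distribution over the mixed classes.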