Contrastive representation enhancement and learning for handwritten mathematical expression recognition

IF 3.9 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence) | Pattern Recognition Letters | Pub Date: 2024-08-30 | DOI: 10.1016/j.patrec.2024.08.021
Zihao Lin, Jinrong Li, Gang Dai, Tianshui Chen, Shuangping Huang, Jianmin Lin
Pattern Recognition Letters, Volume 186, Pages 14-20. Journal Article. https://www.sciencedirect.com/science/article/pii/S0167865524002538
Citations: 0

Abstract

Contrastive representation enhancement and learning for handwritten mathematical expression recognition

Handwritten mathematical expression recognition (HMER) is an appealing task because of its wide applications and research challenges. Previous deep learning-based methods use a string decoder to emphasize expression symbol awareness and achieve strong recognition performance. However, these methods still struggle to recognize handwritten symbols whose appearance varies widely, because large appearance variations make the symbol representation ambiguous. Our intuition is therefore to use printed expressions, which have a uniform appearance, as templates for handwritten expressions, alleviating the effects of varying symbol appearance. In this paper, we propose a contrastive learning method in which handwritten symbols with identical semantics are clustered together under the guidance of printed symbols, leading the model to learn more robust symbol semantic representations. Specifically, we propose an anchor generation scheme to obtain printed expression images corresponding to handwritten expressions. We then propose a contrastive learning objective, termed Semantic-NCE Loss, to pull together printed and handwritten symbols with identical semantics. Moreover, we employ a string decoder to parse the calibrated semantic representations and output the recognized expression symbols. Experimental results on the benchmark datasets CROHME 14/16/19 demonstrate that our method noticeably improves the recognition accuracy of handwritten expressions and outperforms standard string decoder methods.
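
The abstract does not give the formula for Semantic-NCE Loss, so the following is only a minimal sketch of a generic InfoNCE-style objective in the same spirit, not the authors' definition. It assumes each handwritten symbol and each printed anchor is already represented by an embedding vector with a symbol label; the function names, the cosine similarity, and the temperature value are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(handwritten, hw_labels, printed, pr_labels, temperature=0.1):
    """Mean InfoNCE-style loss (hypothetical helper, not the paper's loss).

    For each handwritten embedding, printed anchors with the same label are
    positives and all other printed anchors are negatives.
    """
    losses = []
    for h, label in zip(handwritten, hw_labels):
        if label not in pr_labels:
            continue  # no printed anchor for this symbol, so nothing to pull toward
        logits = [cosine(h, p) / temperature for p in printed]
        shift = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - shift) for x in logits]
        positive = sum(e for e, pl in zip(exps, pr_labels) if pl == label)
        losses.append(-math.log(positive / sum(exps)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: two printed anchors and two handwritten symbols in a 2-D space.
printed = [[1.0, 0.0], [0.0, 1.0]]
pr_labels = ["x", "+"]
hw_labels = ["x", "+"]

aligned = [[0.9, 0.1], [0.1, 0.9]]      # each sample lies near its own anchor
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # each sample lies near the wrong anchor

aligned_loss = info_nce(aligned, hw_labels, printed, pr_labels)
misaligned_loss = info_nce(misaligned, hw_labels, printed, pr_labels)
print(aligned_loss < misaligned_loss)
```

Minimizing a loss of this shape lowers the value for handwritten embeddings that sit close to the printed anchor of the same symbol, which is the "pull together" behavior the abstract describes. How the paper extracts symbol-level features and weights this term against the decoder loss is not stated here.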

Source journal
Pattern Recognition Letters (Engineering and Technology, Computer Science: Artificial Intelligence)
CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months
Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.
Latest articles from this journal
Prototypical class-wise test-time adaptation
Improving ViT interpretability with patch-level mask prediction
Sparse-attention augmented domain adaptation for unsupervised person re-identification
GANzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations
Neuromorphic face analysis: A survey