A robust framework for mathematical formula detection

M. Tran, Tri Pham, Tien Nguyen, Tien Do, T. Ngo
Published in: 2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR)
Publication date: 2021-10-01
DOI: 10.1109/MAPR53640.2021.9585197 (https://doi.org/10.1109/MAPR53640.2021.9585197)

Abstract

Mathematical formula identification is a crucial step in the pipeline of many tasks, such as mathematical information retrieval and the storage of digital scientific documents. Before formulas can be recognized, all of these tasks must first detect the bounding boxes of the mathematical expressions. Currently, deep learning-based object detection methods work well for mathematical formula detection (MFD). These methods fall into two categories: "anchor self-study" and "anchor not self-study". Anchor self-study methods are effective when a large number of labels is available but perform less well with few labels, whereas methods of the second type work better with few labels. We therefore propose an algorithm that keeps the good predictions of each type and merges them into the final results. To test this hypothesis, we select two typical object detection methods, YOLOv5 and Faster RCNN, as representatives of the two approaches and use them to build an MFD framework. In our experiments on ICDAR2021-MFD1, the whole system achieves an F1 score of 89.3, while the single detectors reach only 74.2 (Faster RCNN) and 88.9 (YOLOv5), which demonstrates the effectiveness of the proposal.
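
The abstract does not spell out how the two detectors' outputs are filtered and merged, so the following is only a minimal sketch of one plausible reading: keep each detector's confident boxes, pool them, and remove duplicates with greedy IoU-based suppression. The function names, the `keep_score` and `iou_thresh` parameters, and the sample boxes are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A detection is (x1, y1, x2, y2, score); this layout is an assumption.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(
    dets_a: List[Box],
    dets_b: List[Box],
    keep_score: float = 0.5,   # hypothetical confidence cut-off
    iou_thresh: float = 0.5,   # hypothetical duplicate threshold
) -> List[Box]:
    """Pool confident boxes from both detectors and drop overlapping duplicates."""
    # Step 1: keep only the confident predictions of each detector.
    pooled = [d for d in dets_a + dets_b if d[4] >= keep_score]
    # Step 2: greedy suppression, highest score first.
    pooled.sort(key=lambda d: d[4], reverse=True)
    merged: List[Box] = []
    for det in pooled:
        if all(iou(det, kept) < iou_thresh for kept in merged):
            merged.append(det)
    return merged


# Made-up outputs standing in for the two detectors on one page image.
yolo_boxes = [(10, 10, 50, 30, 0.9), (100, 100, 150, 120, 0.3)]
frcnn_boxes = [(12, 11, 51, 31, 0.8), (200, 200, 260, 220, 0.7)]
merged = merge_detections(yolo_boxes, frcnn_boxes)
```

With these sample boxes, the low-score box is filtered out and the two near-identical boxes collapse into the higher-scoring one, so two detections remain. The framework described in the paper may use a different selection or fusion rule.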