Mathematical Information Retrieval (MIR) from Scanned PDF Documents and MathML Conversion

IPSJ Transactions on Computer Vision and Applications (JCR Q1, Computer Science), pp. 132-142 | Pub Date: 2014-12-10 | DOI: 10.2197/ipsjtcva.6.132

A. Nazemi, I. Murray, D. McMeekin


Citations: 2

Abstract

This paper describes part of an ongoing comprehensive research project that aims to generate MathML from images of mathematical expressions extracted from scanned PDF documents. A MathML representation of a scanned PDF document reduces the document's storage size and encodes the mathematical notation and meaning. The MathML representation then becomes suitable for vocalization and accessible through assistive technologies. To achieve an accurate layout analysis of a scanned PDF document, all textual and non-textual components must be recognised, identified and tagged. These components may be text, mathematical expressions, or graphics in the form of images, figures, tables and/or diagrams. Mathematical expressions are among the most significant components of scanned scientific and engineering PDF documents and need to be machine readable for use with assistive technologies. This research is a work in progress and comprises several modules: detecting and extracting mathematical expressions, recursive primitive component extraction, non-alphanumeric symbol recognition, structural semantic analysis, and merging primitive components to generate the MathML of the scanned PDF document. An optional module converts MathML to audio using a Text to Speech (TTS) engine to make the document accessible to vision-impaired users.

Keywords: math recognition, graphics recognition, Mathematical Informati
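The abstract does not give implementation details, so the following is only a minimal sketch of what the final two stages (merging recognised components into MathML, then producing text for a TTS engine) might look like. The nested-tuple layout tree, the node kinds (`num`, `id`, `op`, `row`, `sup`, `frac`) and the function names `to_mathml`, `mathml_document` and `to_speech` are all hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical mapping from leaf node kinds to presentation MathML tags.
LEAF_TAGS = {"num": "mn", "id": "mi", "op": "mo"}
# Hypothetical mapping from structural node kinds to MathML layout tags.
LAYOUT_TAGS = {"row": "mrow", "sup": "msup", "frac": "mfrac"}
# Spoken names for a few operators (illustrative, not exhaustive).
SPOKEN_OPS = {"+": "plus", "-": "minus", "=": "equals"}


def to_mathml(node):
    """Convert an assumed layout tree (nested tuples) into a MathML element."""
    kind = node[0]
    if kind in LEAF_TAGS:
        element = ET.Element(LEAF_TAGS[kind])
        element.text = node[1]
        return element
    if kind in LAYOUT_TAGS:
        element = ET.Element(LAYOUT_TAGS[kind])
        for child in node[1:]:
            element.append(to_mathml(child))
        return element
    raise ValueError(f"unknown node kind: {kind!r}")


def mathml_document(node):
    """Wrap the converted tree in a <math> root and serialise it."""
    root = ET.Element("math", xmlns=MATHML_NS)
    root.append(to_mathml(node))
    return ET.tostring(root, encoding="unicode")


def to_speech(node):
    """Produce a plain-text reading that a TTS engine could vocalise."""
    kind = node[0]
    if kind in ("num", "id"):
        return node[1]
    if kind == "op":
        return SPOKEN_OPS.get(node[1], node[1])
    if kind == "row":
        return " ".join(to_speech(child) for child in node[1:])
    if kind == "sup":
        return f"{to_speech(node[1])} to the power {to_speech(node[2])}"
    if kind == "frac":
        return f"fraction {to_speech(node[1])} over {to_speech(node[2])} end fraction"
    raise ValueError(f"unknown node kind: {kind!r}")


# Example: the expression x^2 / (y + 1) as an assumed recogniser output.
tree = (
    "frac",
    ("sup", ("id", "x"), ("num", "2")),
    ("row", ("id", "y"), ("op", "+"), ("num", "1")),
)

mathml = mathml_document(tree)
spoken = to_speech(tree)
print(mathml)
print(spoken)
```

Keeping the layout tree separate from both output formats means the same recognised structure can feed the MathML writer and the speech text without re-parsing the generated XML. A real system would more likely derive speech from the MathML itself, as the abstract describes.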


Source journal

IPSJ Transactions on Computer Vision and Applications (Computer Science: Computer Vision and Pattern Recognition)

Self-citation rate

0.00%

Number of articles published