Towards generating web-accessible STEM documents from PDF
V. Sorge, Akashdeep Bansal, Neha Jadhav, Himanshu Garg, Ayushi Verma, M. Balakrishnan
Proceedings of the 17th International Web for All Conference, 2020-04-17
DOI: 10.1145/3371300.3383351 (https://doi.org/10.1145/3371300.3383351)
Citations: 7
Abstract
PDF remains a popular format, widely used to exchange and archive electronic documents. Although considerable effort has gone into making PDF documents accessible, they are still far from ideal when they contain complex content such as formulas, diagrams or tables. Unfortunately, many publications in scientific subjects are available in PDF format only and are therefore only partially accessible, if at all. In this paper, we present a fully automated, web-based technology that converts PDF documents into an accessible single-file format. We concentrate on presenting working solutions for mathematical formulas and tables, and also discuss some of the open problems in this context and how we aim to solve them in the future.