Accessibility of Tables in PDF Documents

Information Technology and Libraries (IF 1.5; Zone 4, Management Science; JCR Q3, Computer Science, Information Systems). Pub Date: 2021-09-20. DOI: 10.6017/ital.v40i3.12325
N. Fayyaz, Shah Khusro, Shakir Ullah
Citations: 5

Abstract

People access and share information over the web and in other digital environments, including digital libraries, in the form of documents such as books, articles, and technical reports. These documents come in a variety of formats, of which the Portable Document Format (PDF) is the most widely used because of its emphasis on preserving the layout of the original material. Retrieving relevant material from these derivative documents is challenging for information retrieval (IR) because their rich semantic structure is lost, which makes it hard to retrieve important units such as images, figures, algorithms, mathematical formulas, and tables. Among these elements, tables are particularly important: if they are made retrievable and presentable to readers, they can add value to the resource description, discovery, and accessibility of documents, not only on the web but also in libraries. Sighted users rely on visual cues to make sense of tables, but blind and visually impaired users must rely on assistive technologies, including text-to-speech and screen readers. However, these technologies do not give tables enough attention to present them effectively to visually impaired individuals. Therefore, ways must be found to make tables in PDF documents not only retrievable but also comprehensible. Before developing such solutions, it is necessary to review the available assistive technologies, tools, and frameworks for their capabilities, strengths, and limitations from the comprehension perspective of blind and visually impaired people, along with suitable environments such as digital libraries. We found no review article that critically and analytically presents and evaluates these technologies.
To fill this gap in the literature, this review paper reports on the current state of the accessibility of PDF documents, digital libraries, assistive technologies, tools, and frameworks that make PDF tables comprehensible and accessible to blind and visually impaired people. The study findings have implications for libraries, information sciences, and information retrieval.
Source journal: Information Technology and Libraries (Management Science; Computer Science: Information Systems)
CiteScore: 2.90
Self-citation rate: 5.60%
Articles published: 25
Review time: 1 month
Journal description: Information Technology and Libraries publishes original material related to all aspects of information technology in all types of libraries. Topic areas include, but are not limited to, library automation, digital libraries, metadata, identity management, distributed systems and networks, computer security, intellectual property rights, technical standards, geographic information systems, desktop applications, information discovery tools, web-scale library services, cloud computing, digital preservation, data curation, virtualization, search-engine optimization, emerging technologies, social networking, open data, the semantic web, mobile services and applications, usability, universal access to technology, library consortia, vendor relations, and digital humanities.