Computer vision-based analysis of web page structure for assistive interfaces

M. Cormier
Proceedings of the 13th Web for All Conference, published 2016-04-11
DOI: 10.1145/2899475.2899506 (https://doi.org/10.1145/2899475.2899506)
Citations: 4

Abstract

My PhD research aims to develop novel solutions to the challenge of identifying web page structure through the visual analysis of web pages as images. The intention is then to combine this back-end design with various front-end applications to provide improved web experiences for users with assistive needs (e.g. assisting visually impaired users by supporting more selective screen reader output, or improving the experience of users with cognitive deficits by reducing clutter or zooming in on selected web page content). I propose to build a comprehensive computer vision-based system to analyse the semantic structure of web pages based purely on an image of the rendered page, which will produce a rich representation of the page as a tree of regions labelled according to their semantic role. Most research into web page segmentation has focused on the structure of the DOM tree and on visual features derived from properties specified in the DOM tree. I argue, however, that the image of the rendered page may be a better representation to use, since it is created by the page designer to convey the structure of the page to the user, while the source code and DOM tree are simply intended to cause the browser's rendering engine to produce the correct appearance, and treat many types of content as black boxes. Additionally, my proposed system uses exactly the information seen by a user regardless of implementation method; this gives advantages in implementation-independence and versatility.
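
To make the proposed output more concrete, the following is a minimal sketch of how a tree of regions labelled by semantic role might be represented and queried by a front-end application. It is an illustration only, not the author's implementation: the `Region` class, the `select` helper, the label names, and the pixel coordinates are all assumptions made for this example, and the vision-based segmentation that would produce such a tree is not shown.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set, Tuple


@dataclass
class Region:
    """A rectangular region of the rendered page image (hypothetical structure)."""
    label: str                       # semantic role; the label set here is assumed
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in image pixels
    children: List["Region"] = field(default_factory=list)

    def walk(self) -> Iterator["Region"]:
        """Yield this region, then all of its descendants, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


def select(root: Region, wanted: Set[str]) -> List[Region]:
    """Return regions whose label is in `wanted`, in depth-first order."""
    return [region for region in root.walk() if region.label in wanted]


# A hand-built example tree; a real system would derive this from the page image.
page = Region("page", (0, 0, 1280, 2000), [
    Region("header", (0, 0, 1280, 120), [
        Region("navigation", (0, 80, 1280, 120)),
    ]),
    Region("main", (0, 120, 960, 1900), [
        Region("article", (40, 160, 920, 1800)),
    ]),
    Region("advertisement", (960, 120, 1280, 900)),
])

# A front end could read out only the article, or hide regions it treats as clutter.
main_content = select(page, {"article"})
clutter = select(page, {"advertisement"})
```

A front end for more selective screen reader output could pass only the `main_content` regions on for reading, while a clutter-reduction front end could mask the `clutter` regions in the displayed image. Because each region carries only a label and image coordinates, nothing in this representation depends on how the page was implemented.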