面向定制的终端用户Web抓取

Kapaya Katongo, Geoffrey Litt, D. Jackson
{"title":"面向定制的终端用户Web抓取","authors":"Kapaya Katongo, Geoffrey Litt, D. Jackson","doi":"10.1145/3464432.3464437","DOIUrl":null,"url":null,"abstract":"Websites are malleable: users can run code in the browser to customize them. However, this malleability is typically only accessible to programmers with knowledge of HTML and Javascript. Previously, we developed a tool called Wildcard which empowers end-users to customize websites through a spreadsheet-like table interface without doing traditional programming. However, there is a limit to end-user agency with Wildcard, because programmers need to first create site-specific adapters mapping website data to the table interface. This means that end-users can only customize a website if a programmer has written an adapter for it, and cannot extend or repair existing adapters. In this paper, we extend Wildcard with a new system for end-user web scraping for customization. It enables end-users to create, extend and repair adapters, by performing concrete demonstrations of how the website user interface maps to a data table. We describe three design principles that guided our system’s development and are applicable to other end-user web scraping and customization systems: (a) users should be able to scrape data and use it in a single, unified environment, (b) users should be able to extend and repair the programs that scrape data via demonstration and (c) users should receive live feedback during their demonstrations. We have successfully used our system to create, extend and repair adapters by demonstration on a variety of websites and we provide example usage scenarios that showcase each of our design principles. Our ultimate goal is to empower end-users to customize websites in the course of their daily use in an intuitive and flexible way, and thus making the web more malleable for all of its users.","PeriodicalId":421912,"journal":{"name":"Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Towards End-User Web Scraping for Customization\",\"authors\":\"Kapaya Katongo, Geoffrey Litt, D. Jackson\",\"doi\":\"10.1145/3464432.3464437\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Websites are malleable: users can run code in the browser to customize them. However, this malleability is typically only accessible to programmers with knowledge of HTML and Javascript. Previously, we developed a tool called Wildcard which empowers end-users to customize websites through a spreadsheet-like table interface without doing traditional programming. However, there is a limit to end-user agency with Wildcard, because programmers need to first create site-specific adapters mapping website data to the table interface. This means that end-users can only customize a website if a programmer has written an adapter for it, and cannot extend or repair existing adapters. In this paper, we extend Wildcard with a new system for end-user web scraping for customization. It enables end-users to create, extend and repair adapters, by performing concrete demonstrations of how the website user interface maps to a data table. We describe three design principles that guided our system’s development and are applicable to other end-user web scraping and customization systems: (a) users should be able to scrape data and use it in a single, unified environment, (b) users should be able to extend and repair the programs that scrape data via demonstration and (c) users should receive live feedback during their demonstrations. We have successfully used our system to create, extend and repair adapters by demonstration on a variety of websites and we provide example usage scenarios that showcase each of our design principles. Our ultimate goal is to empower end-users to customize websites in the course of their daily use in an intuitive and flexible way, and thus making the web more malleable for all of its users.\",\"PeriodicalId\":421912,\"journal\":{\"name\":\"Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming\",\"volume\":\"48 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3464432.3464437\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3464432.3464437","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

网站是可塑的:用户可以在浏览器中运行代码来定制它们。然而,这种延展性通常只有具备HTML和Javascript知识的程序员才能使用。之前,我们开发了一个名为Wildcard的工具,它使最终用户能够通过类似电子表格的表格界面定制网站,而无需进行传统的编程。然而,通配符对终端用户代理有限制,因为程序员需要首先创建特定于站点的适配器,将网站数据映射到表接口。这意味着终端用户只能在程序员为其编写适配器的情况下定制网站,而不能扩展或修复现有的适配器。在本文中,我们将通配符扩展为一个新的系统,用于最终用户的定制web抓取。它通过执行网站用户界面如何映射到数据表的具体演示,使最终用户能够创建、扩展和修复适配器。我们描述了指导我们系统开发的三个设计原则,这些原则适用于其他终端用户的网络抓取和定制系统:(a)用户应该能够在一个单一的、统一的环境中抓取数据并使用它;(b)用户应该能够通过演示扩展和修复抓取数据的程序;(c)用户应该在演示期间收到实时反馈。通过在各种网站上的演示,我们已经成功地使用了我们的系统来创建、扩展和修复适配器,并且我们提供了展示我们的每个设计原则的示例使用场景。我们的最终目标是让终端用户在日常使用过程中以一种直观和灵活的方式定制网站,从而使网络对所有用户更具可塑性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Towards End-User Web Scraping for Customization
Websites are malleable: users can run code in the browser to customize them. However, this malleability is typically only accessible to programmers with knowledge of HTML and Javascript. Previously, we developed a tool called Wildcard which empowers end-users to customize websites through a spreadsheet-like table interface without doing traditional programming. However, there is a limit to end-user agency with Wildcard, because programmers need to first create site-specific adapters mapping website data to the table interface. This means that end-users can only customize a website if a programmer has written an adapter for it, and cannot extend or repair existing adapters. In this paper, we extend Wildcard with a new system for end-user web scraping for customization. It enables end-users to create, extend and repair adapters, by performing concrete demonstrations of how the website user interface maps to a data table. We describe three design principles that guided our system’s development and are applicable to other end-user web scraping and customization systems: (a) users should be able to scrape data and use it in a single, unified environment, (b) users should be able to extend and repair the programs that scrape data via demonstration and (c) users should receive live feedback during their demonstrations. We have successfully used our system to create, extend and repair adapters by demonstration on a variety of websites and we provide example usage scenarios that showcase each of our design principles. Our ultimate goal is to empower end-users to customize websites in the course of their daily use in an intuitive and flexible way, and thus making the web more malleable for all of its users.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Rec.HTML: Declarative HTML Toward Exploratory Understanding of Software using Test Suites Towards End-User Web Scraping for Customization Studying Programmer Behaviour at Scale: A Case Study using Amazon Mechanical Turk Oron: Towards a Dynamic Analysis Instrumentation Platform for AssemblyScript
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1