Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming: Latest Articles

Studying Programmer Behaviour at Scale: A Case Study using Amazon Mechanical Turk
Jason T. Jacques, P. Kristensson
Developing and maintaining a correct and consistent model of how code will be executed is an ongoing challenge for software developers. However, validating the tools and techniques we develop to aid programmers can be a challenge plagued by small sample sizes, high costs, or poor generalisability. This paper serves as a case study using a web-based crowdsourcing approach to study programmer behaviour at scale. We demonstrate this method to create controlled coding experiments at modest cost, highlight the efficacy of this approach with objective validation, and comment on notable findings from our prototype experiment into one of the most ubiquitous, yet understudied, features of modern software development environments: syntax highlighting.
DOI: 10.1145/3464432.3464436 (https://doi.org/10.1145/3464432.3464436). Published: 2021-03-22.
Citations: 3
From ASTs to Machine Code with LLVM
Dimitri Racordon
A compiler is a program that translates source code written in a particular language into another language. Internally, the whole process is typically split into multiple stages that handle one particular aspect of this translation. One of these consists of translating the high-level representation of the program, typically an abstract syntax tree, into a simpler form that is suitable for analysis, optimizations, and code generation. This tutorial paper focuses on this process, and uses LLVM to compile programs into optimized machine code. LLVM is a language-agnostic compiler toolchain that handles program optimization and code generation. It is based on its own internal representation, called LLVM IR, which is then transformed into machine code. We give a brief introduction to LLVM IR, and describe a few patterns to translate high-level language constructs expressed as abstract syntax trees. We implement these patterns in a compiler for a toy programming language, named Cocodol, which supports dynamic typing, unbounded loops, and higher-order functions.
DOI: 10.1145/3464432.3464777 (https://doi.org/10.1145/3464432.3464777). Published: 2021-03-22.
Citations: 1
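To make the translation step described in "From ASTs to Machine Code with LLVM" more concrete, here is a minimal sketch of lowering a tiny expression tree into a linear, three-address form. It is not taken from the paper or from the Cocodol compiler: the `Num` and `BinOp` node types, the `MNEMONICS` table, and the fixed `i64` type are all assumptions made for illustration. The emitted text is modelled on LLVM IR's textual syntax but has not been checked with LLVM tools.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str          # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# Hypothetical mapping from source operators to instruction mnemonics.
MNEMONICS = {"+": "add", "-": "sub", "*": "mul"}


def lower(expr: Expr, out: List[str]) -> str:
    """Append instructions for `expr` to `out`; return the operand holding its value."""
    if isinstance(expr, Num):
        # Constants are used directly as operands and need no instruction.
        return str(expr.value)
    lhs = lower(expr.left, out)
    rhs = lower(expr.right, out)
    # Each instruction defines a fresh temporary, named after its position.
    name = f"%t{len(out)}"
    out.append(f"{name} = {MNEMONICS[expr.op]} i64 {lhs}, {rhs}")
    return name


def compile_expr(expr: Expr) -> List[str]:
    """Lower a whole expression and return its value from the enclosing function."""
    code: List[str] = []
    result = lower(expr, code)
    code.append(f"ret i64 {result}")
    return code
```

Under these assumptions, `compile_expr(BinOp("+", Num(1), BinOp("*", Num(2), Num(3))))` walks the tree bottom-up, so the multiplication is emitted before the addition that consumes it. A real front end would also need to handle types, control flow, and closures, which this sketch leaves out.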
Improving on the Experience of Hand-Assembling Programs for Application-Specific Architectures
Ian Piumarta
Creating an application-specific processor is an effective and popular way to solve many problems in embedded hardware design using FPGAs, ASICs, or custom silicon. Programming these processors is complicated by the lack of toolchain support for creating the necessary binary code as part of hardware design, implementation, and evaluation. Hardware developers who cannot create their own ad-hoc assembler are left to hand-assemble their code into binary instructions which is both painful and error prone. We present a tool that supports the rapid creation of assemblers for application-specific processors. A single language is used to specify both instruction formats as collections of bit fields and the instantiation of those formats into sequences of binary instructions as a single, homogeneous activity that is designed to be as familiar and accessible to hardware designers as possible. The output from the tool can be used directly by hardware synthesis tools to initialise the program memory of an application-specific processor.
DOI: 10.1145/3464432.3464434 (https://doi.org/10.1145/3464432.3464434). Published: 2021-03-22.
Citations: 0
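The idea in the hand-assembling abstract, describing an instruction format as a collection of bit fields and then instantiating it into binary words, can be sketched as follows. This is not the paper's tool or its specification language: the `LOAD_IMM` format, its field names and widths, and the plain hex-dump output are hypothetical choices made only to illustrate the technique.

```python
from typing import Dict, List, Tuple

# A format is an ordered list of (field name, width in bits), most significant first.
Format = List[Tuple[str, int]]

# Hypothetical 16-bit format: 4-bit opcode, 4-bit register, 8-bit immediate.
LOAD_IMM: Format = [("opcode", 4), ("rd", 4), ("imm", 8)]


def encode(fmt: Format, values: Dict[str, int]) -> int:
    """Pack the field values into one instruction word, checking each field's range."""
    word = 0
    for name, width in fmt:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        # Shift previously packed fields left and place this one below them.
        word = (word << width) | value
    return word


def to_hex_lines(words: List[int], bits: int = 16) -> List[str]:
    """Render each word as zero-padded hex, one word per line (an assumed output layout)."""
    digits = (bits + 3) // 4
    return [f"{w:0{digits}x}" for w in words]
```

For example, `encode(LOAD_IMM, {"opcode": 0x1, "rd": 0x2, "imm": 0x2A})` packs the three fields into the single word `0x122A`. The range check is the part that hand assembly tends to get wrong, which is why it is done per field rather than on the finished word.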
Oron: Towards a Dynamic Analysis Instrumentation Platform for AssemblyScript
Aäron Munsters, Angel Luis Scull Pupo, Jim Bauwens, E. G. Boix
The dynamic nature of JavaScript may lead to challenges and issues regarding efficiency and security. Analysis tools can help developers tackle some of these issues. In the context of web applications, dynamic analyses are best suited for handling those dynamic features but may affect the program's execution performance. In a first experiment, we attempted to improve the performance of the Aran dynamic analysis platform for JavaScript by utilizing WebAssembly. The extension caused extra performance hits due to context switches between JavaScript and WebAssembly. Because these context switches are inevitable, we decided to refit our work for the analysis of AssemblyScript, a variant of TypeScript which compiles to WebAssembly (and therefore excludes context switches). In this work, we explore this approach in the form of a new source code instrumentation platform named Oron, which allows for the instrumentation of AssemblyScript code. The presented platform is evaluated and shows promising improvements which provide a solid basis for efficient dynamic analysis of AssemblyScript applications.
DOI: 10.1145/3464432.3464780 (https://doi.org/10.1145/3464432.3464780). Published: 2021-03-22.
Citations: 0
Rec.HTML: Declarative HTML
Bob Reynders, Kwanghoon Choi
Interactive user experiences on the web are becoming the norm. Client-side programs are becoming more complicated and have to deal with event handling, reading HTML document state and updating the interface. In this paper we propose a declarative language that supports these three facets of client-side browser development declaratively and provides a programming model where complex interfaces can be written using simple programming techniques such as records, functions and recursion.
DOI: 10.1145/3464432.3464779 (https://doi.org/10.1145/3464432.3464779). Published: 2021-03-22.
Citations: 0
Toward Exploratory Understanding of Software using Test Suites
D. Meier, Toni Mattis, R. Hirschfeld
Changing software without correctly understanding it often leads to confusion, as developers do not understand how the change corresponds to the new observed behaviour of the system. Today, many software systems are equipped with a test suite. Test suites document code and give feedback on changed program behaviour. We explored ways to use test suites for software comprehension and implemented a tool that provides additional visualisation and gives immediate feedback on software changes. Information about changes in the software and their implications for the test suite is collected using mutation testing. The tool uses this information to present relevant test cases for developers, and additionally prioritise test executions for immediate feedback. Our research indicates that entropy metrics can find test cases that are relevant for a specific context in the source code. Additionally, simple test case prioritisation strategies can already lead to a significant decrease in feedback time. Based on our case study we argue that test suites are not only useful for regression testing but can be used to generate meaningful information for software comprehension activities.
DOI: 10.1145/3464432.3464438 (https://doi.org/10.1145/3464432.3464438). Published: 2021-03-22.
Citations: 2
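The abstract of "Toward Exploratory Understanding of Software using Test Suites" mentions entropy metrics over mutation-testing data without giving a formula. One possible reading is sketched below, purely as an assumption: for each test, take the Shannon entropy of where its killed mutants are located, so that a test whose kills concentrate in few source locations scores low and is treated as more specific. The `Kill` record shape, the function names, and the ranking rule are hypothetical and are not the authors' tool.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# (test name, source location of a mutant that the test killed)
Kill = Tuple[str, str]


def kill_entropy(kills: List[Kill]) -> Dict[str, float]:
    """Shannon entropy (in bits) of each test's kills across source locations."""
    per_test: Dict[str, Counter] = {}
    for test, location in kills:
        per_test.setdefault(test, Counter())[location] += 1
    result: Dict[str, float] = {}
    for test, counts in per_test.items():
        total = sum(counts.values())
        result[test] = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
    return result


def relevant_tests(kills: List[Kill], location: str) -> List[str]:
    """Tests that killed a mutant at `location`, most focused (lowest entropy) first."""
    entropy = kill_entropy(kills)
    candidates = {test for test, loc in kills if loc == location}
    # Ties are broken by test name so the ordering is deterministic.
    return sorted(candidates, key=lambda test: (entropy[test], test))
```

With this reading, a test that only ever kills mutants in one file has entropy zero and ranks ahead of a broad integration test that kills mutants everywhere. Whether that matches the metric evaluated in the paper cannot be determined from the abstract alone.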
Javardeye: Gaze Input for Cursor Control in a Structured Editor
André L. M. Santos
Programmers spend considerable time jumping between editing positions in the source code, often requiring the use of the mouse and/or arrow keys to position the cursor at the desired editing position. We developed Javardeye, a prototype code editor for Java integrated with eye tracking technology for controlling the editing cursor. Our implementation is based on a structured editor, leveraging its particular characteristics and augmenting it with a secondary, latent cursor controlled by eye gaze. This paper describes the main design decisions and tradeoffs of our approach.
DOI: 10.1145/3464432.3464435 (https://doi.org/10.1145/3464432.3464435). Published: 2021-03-22.
Citations: 1
Exploring Modal Locking in Window Manipulation: Why Programmers Should Stash, Duplicate, Split, and Link Composite Views
Marcel Taeumel, R. Hirschfeld
Window manipulation plays a vital role in multi-tool user interaction, especially for programmers exploring software artifacts, gathering information for better understanding. However, today’s window managers offer only limited means to organize screen contents, which increases cognitive efforts for both tool builders and users. Builders must account for live integration of composite views; users might have to work around disruptive mode errors when actual tasks conflict with a tool’s design. We follow a pattern-finding approach and present four new verbs for direct window manipulation, which we consolidated from existing tools and systems. If window managers would offer to stash, duplicate, split, and link views, we believe that programmers could better maintain flow during exploration activities.
DOI: 10.1145/3464432.3464433 (https://doi.org/10.1145/3464432.3464433). Published: 2021-03-22.
Citations: 0
Towards End-User Web Scraping for Customization
Kapaya Katongo, Geoffrey Litt, D. Jackson
Websites are malleable: users can run code in the browser to customize them. However, this malleability is typically only accessible to programmers with knowledge of HTML and JavaScript. Previously, we developed a tool called Wildcard which empowers end-users to customize websites through a spreadsheet-like table interface without doing traditional programming. However, there is a limit to end-user agency with Wildcard, because programmers need to first create site-specific adapters mapping website data to the table interface. This means that end-users can only customize a website if a programmer has written an adapter for it, and cannot extend or repair existing adapters. In this paper, we extend Wildcard with a new system for end-user web scraping for customization. It enables end-users to create, extend and repair adapters, by performing concrete demonstrations of how the website user interface maps to a data table. We describe three design principles that guided our system’s development and are applicable to other end-user web scraping and customization systems: (a) users should be able to scrape data and use it in a single, unified environment, (b) users should be able to extend and repair the programs that scrape data via demonstration and (c) users should receive live feedback during their demonstrations. We have successfully used our system to create, extend and repair adapters by demonstration on a variety of websites and we provide example usage scenarios that showcase each of our design principles. Our ultimate goal is to empower end-users to customize websites in the course of their daily use in an intuitive and flexible way, thus making the web more malleable for all of its users.
DOI: 10.1145/3464432.3464437 (https://doi.org/10.1145/3464432.3464437). Published: 2021-03-22.
Citations: 2
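The web scraping abstract describes site-specific adapters that map website data to a table interface. As a rough illustration of what such a mapping might look like, the sketch below extracts table rows from HTML using CSS class names. It is not Wildcard's adapter API and does not implement creation by demonstration: the `TableAdapter` class, its `row_class` and `columns` parameters, and the class-name matching rule are all assumptions, built only on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class TableAdapter(HTMLParser):
    """Hypothetical adapter: one table row per element carrying `row_class`."""

    def __init__(self, row_class: str, columns: Dict[str, str]):
        super().__init__()
        self.row_class = row_class
        # Invert {column name: CSS class} into {CSS class: column name} for lookup.
        self.columns = {css: name for name, css in columns.items()}
        self.rows: List[Dict[str, str]] = []
        self._column: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.row_class in classes:
            self.rows.append({})  # a new row starts here
        for css in classes:
            if css in self.columns and self.rows:
                self._column = self.columns[css]  # next text fills this cell

    def handle_data(self, data):
        text = data.strip()
        if self._column is not None and text:
            self.rows[-1][self._column] = text
            self._column = None

    def handle_endtag(self, tag):
        # Leaving an element cancels any pending cell that had no text.
        self._column = None


def scrape(html: str, row_class: str, columns: Dict[str, str]) -> List[Dict[str, str]]:
    adapter = TableAdapter(row_class, columns)
    adapter.feed(html)
    adapter.close()
    return adapter.rows
```

In this sketch the `columns` mapping plays the role that a user's demonstration would play in the system described above: it records which page elements correspond to which table columns. Repairing an adapter after a site redesign would then amount to changing that mapping rather than rewriting the extraction code.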