2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE): Latest Papers (Page 7)

Automatic Text Input Generation for Mobile Testing

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2017-05-01 DOI: 10.1109/ICSE.2017.65

Peng Liu, X. Zhang, Marco Pistoia, Yunhui Zheng, M. Marques, Lingfei Zeng

Many designs have been proposed to improve automated mobile testing. Despite these improvements, providing appropriate text inputs remains a prominent obstacle that hinders the large-scale adoption of automated testing approaches. The key challenge is how to automatically produce the most relevant text in a use case context. For example, a valid website address should be entered in the address bar of a mobile browser app to continue testing the app, and a singer's name should be entered in the search bar of a music recommendation app. Without proper text inputs, the testing would get stuck. We propose a novel deep-learning-based approach to address this challenge, which reduces the problem to a minimization problem. Another challenge is how to make the approach generally applicable to both trained and untrained apps. We leverage the Word2Vec model to address this challenge. We have built our approach into a tool and evaluated it with 50 iOS mobile apps, including Firefox and Wikipedia. The results show that our approach significantly outperforms existing automatic text input generation methods.

Pages: 643-653

Citations: 60

Evaluating and Improving Fault Localization

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2017-05-01 DOI: 10.1109/ICSE.2017.62

Spencer Pearson, José Campos, René Just, G. Fraser, Rui Abreu, Michael D. Ernst, D. Pang, Benjamin Keller

Most fault localization techniques take as input a faulty program, and produce as output a ranked list of suspicious code locations at which the program may be defective. When researchers propose a new fault localization technique, they typically evaluate it on programs with known faults. The technique is scored based on where in its output list the defective code appears. This enables the comparison of multiple fault localization techniques to determine which one is better. Previous research has evaluated fault localization techniques using artificial faults, generated either by mutation tools or manually. In other words, previous research has determined which fault localization techniques are best at finding artificial faults. However, it is not known which fault localization techniques are best at finding real faults. It is not obvious that the answer is the same, given previous work showing that artificial faults have both similarities to and differences from real faults. We performed a replication study to evaluate 10 claims in the literature that compared fault localization techniques (from the spectrum-based and mutation-based families). We used 2995 artificial faults in 6 real-world programs. Our results support 7 of the previous claims as statistically significant, but only 3 as having non-negligible effect sizes. Then, we evaluated the same 10 claims, using 310 real faults from the 6 programs. Every previous result was refuted or was statistically and practically insignificant. Our experiments show that artificial faults are not useful for predicting which fault localization techniques perform best on real faults. In light of these results, we identified a design space that includes many previously-studied fault localization techniques as well as hundreds of new techniques. We experimentally determined which factors in the design space are most important, using an overall set of 395 real faults. Then, we extended this design space with new techniques. Several of our novel techniques outperform all existing techniques, notably in terms of ranking defective code in the top-5 or top-10 reports.
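
To make the "ranked list of suspicious code locations" concrete, the following is a minimal sketch of one well-known spectrum-based formula, Ochiai. It is an illustration only: the abstract does not say which formulas the study compares, and the function names, the coverage data layout, and the location labels are all assumptions made for this example.

```python
import math


def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness of one location: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom else 0.0


def rank_locations(coverage, outcomes):
    """Rank code locations from most to least suspicious.

    coverage: test name -> set of locations the test executed (assumed layout)
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    locations = set().union(*coverage.values())
    scores = {}
    for loc in locations:
        # ef / ep: number of failing / passing tests that executed this location
        ef = sum(1 for t, locs in coverage.items() if loc in locs and not outcomes[t])
        ep = sum(1 for t, locs in coverage.items() if loc in locs and outcomes[t])
        scores[loc] = ochiai(ef, ep, total_failed)
    # Highest score first; ties are broken alphabetically to keep the output stable
    return sorted(scores, key=lambda loc: (-scores[loc], loc))
```

A technique is then scored by where the truly defective location falls in the returned list, which is the evaluation setup the abstract describes.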

Pages: 609-620

Citations: 321

An Unsupervised Approach for Discovering Relevant Tutorial Fragments for APIs

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2017-03-05 DOI: 10.1109/ICSE.2017.12

He Jiang, Jingxuan Zhang, Zhilei Ren, Zhang Tao

Developers increasingly rely on API tutorials to facilitate software development. However, it remains challenging for them to discover relevant API tutorial fragments that explain unfamiliar APIs. Existing supervised approaches suffer from the heavy burden of manually preparing corpus-specific annotated data and features. In this study, we propose a novel unsupervised approach, namely Fragment Recommender for APIs with PageRank and Topic model (FRAPT). FRAPT addresses the two main challenges of the task and effectively determines relevant tutorial fragments for APIs. In FRAPT, a Fragment Parser is proposed to identify APIs in tutorial fragments and replace ambiguous pronouns and variables with related ontologies and API names, so as to address the pronoun and variable resolution challenge. Then, a Fragment Filter employs a set of non-explanatory detection rules to remove non-explanatory fragments, thus addressing the non-explanatory fragment identification challenge. Finally, two correlation scores are computed and aggregated to determine relevant fragments for APIs, by applying both a topic model and the PageRank algorithm to the retained fragments. Extensive experiments over two publicly available tutorial corpora show that FRAPT improves on the state-of-the-art approach by 8.77% and 12.32%, respectively, in terms of F-Measure. The effectiveness of the key components of FRAPT is also validated.
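
As a rough illustration of the PageRank step mentioned above, here is a minimal power-iteration sketch. How FRAPT actually builds its graph over fragments and APIs is not stated in the abstract, so the `links` structure, the node names, and the damping factor of 0.85 are assumptions for this example rather than the authors' configuration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank.

    links: node -> list of nodes it points to (hypothetical fragment graph)
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the "random jump" share first
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

In a FRAPT-like pipeline, a score of this kind would be combined with a topic-model score before fragments are recommended; the aggregation rule itself is not given in the abstract and is left out here.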

Pages: 38-48

Citations: 55

What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2017-03-02 DOI: 10.1109/ICSE.2017.71

He Jiang, Xiaochen Li, Z. Yang, J. Xuan

Driven by new software development processes and testing in clouds, system and integration testing nowadays tends to produce an enormous number of alarms. Such test alarms place an almost unbearable burden on software testing engineers, who have to manually analyze the causes of these alarms. The causes are critical because they decide which stakeholders are responsible for fixing the bugs detected during testing. In this paper, we present a novel approach that aims to relieve the burden by automating the procedure. Our approach, called Cause Analysis Model, exploits information retrieval techniques to efficiently infer test alarm causes based on test logs. We have developed a prototype and evaluated our tool on two industrial datasets with more than 14,000 test alarms. Experiments on the two datasets show that our tool achieves an accuracy of 58.3% and 65.8%, respectively, which outperforms the baseline algorithms by up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1s per cause analysis. Owing to these promising experimental results, our industrial partner, a leading information and communication technology company, has deployed the tool; it achieves an average accuracy of 72% after two months of running, nearly three times more accurate than a previous strategy based on regular expressions.
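
The idea of inferring an alarm's cause from test logs with information retrieval can be sketched as a nearest-neighbour lookup over historical logs. This is a generic TF-IDF and cosine-similarity illustration, not the Cause Analysis Model itself; the function names, the whitespace tokenizer, and the cause labels used with it are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * (math.log(n / df[term]) + 1.0) for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def infer_cause(history, new_log):
    """Return the cause label of the historical log most similar to new_log.

    history: list of (log text, cause label) pairs that were analyzed before
    """
    docs = [text.lower().split() for text, _ in history] + [new_log.lower().split()]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    best = max(range(len(history)), key=lambda i: cosine(vectors[i], query))
    return history[best][1]
```

A lookup of this kind costs one similarity computation per historical log, which is consistent with the sub-second analysis time the abstract reports, although the real system's indexing is not described here.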

Pages: 712-723

Citations: 53

Statically Checking Web API Requests in JavaScript

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2017-02-13 DOI: 10.1109/ICSE.2017.30

Erik Wittern, Annie T. T. Ying, Yunhui Zheng, Julian T Dolby, Jim Laredo

Many JavaScript applications perform HTTP requests to web APIs, relying on the request URL, HTTP method, and request data to be constructed correctly by string operations. Traditional compile-time error checks, such as detecting a call to a non-existent method in Java, are not available for checking whether such requests comply with the requirements of a web API. In this paper, we propose an approach to statically check web API requests in JavaScript. Our approach first extracts a request's URL string, HTTP method, and the corresponding request data using an inter-procedural string analysis, and then checks whether the request conforms to given web API specifications. We evaluated our approach by checking whether web API requests in JavaScript files mined from GitHub are consistent or inconsistent with publicly available API specifications. From the 6575 requests in scope, our approach determined whether the request's URL and HTTP method were consistent or inconsistent with web API specifications with a precision of 96.0%. Our approach also correctly determined whether extracted request data was consistent or inconsistent with the data requirements with a precision of 87.9% for payload data and 99.9% for query data. In a systematic analysis of the inconsistent cases, we found that many of them were due to errors in the client code. The proposed checker can be integrated with code editors or with continuous integration tools to warn programmers about code containing potentially erroneous requests.
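
The second half of the pipeline described above, checking an already extracted URL path and HTTP method against an API specification, can be sketched as follows. The string analysis that extracts the request from JavaScript is out of scope for this sketch, and the specification format (a mapping from path templates with `{param}` placeholders to allowed methods), the function names, and the result strings are assumptions made for illustration.

```python
import re


def template_to_regex(path_template):
    """Compile a path template such as '/users/{id}' into an anchored regex."""
    # Split on {placeholders}; each placeholder matches one non-empty path segment
    parts = re.split(r"\{[^{}]+\}", path_template)
    return re.compile("^" + "[^/]+".join(re.escape(part) for part in parts) + "$")


def check_request(spec, method, path):
    """Classify a request against a spec of path template -> allowed methods."""
    matching = [
        methods for template, methods in spec.items()
        if template_to_regex(template).match(path)
    ]
    if not matching:
        return "inconsistent: path"
    if any(method.upper() in methods for methods in matching):
        return "consistent"
    return "inconsistent: method"
```

Reporting the path mismatch separately from the method mismatch mirrors how a warning in an editor or a continuous integration job would need to point at the specific part of the request that looks wrong.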

Pages: 244-254

Citations: 30

Learning Syntactic Program Transformations from Examples

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2016-08-31 DOI: 10.1109/ICSE.2017.44

R. Sousa, Gustavo Soares, Loris D'Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, Bjoern Hartmann

Automatic program transformation tools can be valuable for programmers to help them with refactoring tasks, and for Computer Science students in the form of tutoring systems that suggest repairs to programming assignments. However, manually creating catalogs of transformations is complex and time-consuming. In this paper, we present REFAZER, a technique for automatically learning program transformations. REFAZER builds on the observation that code edits performed by developers can be used as input-output examples for learning program transformations. Example edits may share the same structure but involve different variables and subexpressions, which must be generalized in a transformation at the right level of abstraction. To learn transformations, REFAZER leverages state-of-the-art programming-by-example methodology using the following key components: (a) a novel domain-specific language (DSL) for describing program transformations, (b) domain-specific deductive algorithms for efficiently synthesizing transformations in the DSL, and (c) functions for ranking the synthesized transformations. We instantiate and evaluate REFAZER in two domains. First, given examples of code edits used by students to fix incorrect programming assignment submissions, we learn program transformations that can fix other students' submissions with similar faults. In our evaluation conducted on 4 programming tasks performed by 720 students, our technique helped to fix incorrect submissions for 87% of the students. In the second domain, we use repetitive code edits applied by developers to the same project to synthesize a program transformation that applies these edits to other locations in the code. In our evaluation conducted on 56 scenarios of repetitive edits taken from three large C# open-source projects, REFAZER learns the intended program transformation in 84% of the cases using only 2.9 examples on average.

Pages: 404-415

Citations: 199

Precise Condition Synthesis for Program Repair

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2016-08-28 DOI: 10.1109/ICSE.2017.45

Yingfei Xiong, Jie Wang, Runfa Yan, Jiachen Zhang, Shi Han, Gang Huang, Lu Zhang

Due to the difficulty of repairing defects, many research efforts have been devoted to automatic defect repair. Given a buggy program that fails some test cases, a typical automatic repair technique tries to modify the program to make all tests pass. However, since the test suites in real-world projects are usually insufficient, aiming at passing the test suites often leads to incorrect patches. This problem is known as weak test suites or overfitting. In this paper we aim to produce precise patches, that is, any patch we produce has a relatively high probability of being correct. More concretely, we focus on condition synthesis, which was shown to be able to repair more than half of the defects in existing approaches. Our key insight is threefold. First, it is important to know what variables in a local context should be used in an "if" condition, and we propose a sorting method based on the dependency relations between variables. Second, we observe that the API document can be used to guide the repair process, and propose a document analysis technique to further filter the variables. Third, it is important to know what predicates should be performed on the set of variables, and we propose to mine a set of frequently used predicates in similar contexts from existing projects. Based on these insights, we develop a novel program repair system, ACS, that can generate precise conditions at faulty locations. Furthermore, given that the generated conditions are very precise, we can perform a repair operation that was previously deemed to be too overfitting: directly returning the test oracle to repair the defect. Using our approach, we successfully repaired 18 defects on four projects of Defects4J, which is the largest number of fully automatically repaired defects reported on the dataset so far. More importantly, the precision of our approach in the evaluation is 78.3%, which is significantly higher than that of previous approaches, which is usually below 40%.

Pages: 416-426

Citations: 261

Classifying Developers into Core and Peripheral: An Empirical Study on Count and Network Metrics

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

Pub Date : 2016-04-04 DOI: 10.1109/ICSE.2017.23

Mitchell Joblin, S. Apel, Claus Hunsen, W. Mauerer

Knowledge about the roles developers play in a software project is crucial to understanding the project's collaborative dynamics. In practice, developers are often classified according to the dichotomy of core and peripheral roles. Typically, count-based operationalizations, which rely on simple counts of individual developer activities (e.g., number of commits), are used for this purpose, but there is concern regarding their validity and ability to elicit meaningful insights. To shed light on this issue, we investigate whether count-based operationalizations of developer roles produce consistent results, and we validate them with respect to developers' perceptions by surveying 166 developers. Improving over the state of the art, we propose a relational perspective on developer roles, using fine-grained developer networks modeling the organizational structure, and by examining developer roles in terms of developers' positions and stability within the developer network. In a study of 10 substantial open-source projects, we found that the primary difference between the count-based and our proposed network-based core–peripheral operationalizations is that the network-based ones agree more with developer perception than count-based ones. Furthermore, we demonstrate that a relational perspective can reveal further meaningful insights, such as that core developers exhibit high positional stability, upper positions in the hierarchy, and high levels of coordination with other core developers, which confirms assumptions of previous work.
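The contrast between count-based and network-based operationalizations can be made concrete with a small sketch. The 80% cutoff, the use of plain node degree as the network metric, and all names and data below are illustrative assumptions, not details taken from the abstract.

```python
from collections import Counter


def _top_share(scores, share):
    """Smallest set of top-scoring developers whose scores reach `share` of the total."""
    total = sum(scores.values())
    core, covered = set(), 0
    for dev, score in scores.most_common():
        if covered >= share * total:
            break
        core.add(dev)
        covered += score
    return core


def core_by_commit_count(commit_authors, share=0.8):
    """Count-based: rank developers by number of commits."""
    return _top_share(Counter(commit_authors), share)


def core_by_degree(edges, share=0.8):
    """Network-based: rank developers by degree in a developer network.

    Each edge links two developers who, for example, edited the same code.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return _top_share(degree, share)


# Hypothetical project: bob commits a lot but collaborates with only one person.
commits = ["ann"] * 60 + ["bob"] * 25 + ["cy"] * 10 + ["dee"] * 5
edges = [("ann", "cy"), ("ann", "dee"), ("ann", "bob"), ("cy", "dee")]

print(sorted(core_by_commit_count(commits)))  # ['ann', 'bob']
print(sorted(core_by_degree(edges)))          # ['ann', 'cy', 'dee']
```

In this made-up data the two views disagree about bob, cy, and dee, which is the kind of divergence a comparison against developer perception would need to adjudicate.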


Pages: 164-174

Citations: 86