
2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE): Latest publications

Incrementally slicing editable submodels
Christopher Pietsch, Manuel Ohrndorf, U. Kelter, Timo Kehrer
Model slicers are tools which provide two services: (a) finding parts of interest in a model and (b) displaying these parts somehow or extracting these parts as a new, autonomous model, which is referred to as a slice or sub-model. This paper focuses on the creation of editable slices, which can be processed by model editors, analysis tools, model management tools, etc. Slices are useful if, e.g., only a part of a large model shall be analyzed, compared or processed by time-consuming algorithms, or if sub-models shall be modified independently. We present a new generic incremental slicer which can slice models of arbitrary type and which creates slices that are consistent in the sense that they are editable by standard editors. It is built on top of a model differencing framework and does not require additional configuration data beyond what is already available in the differencing framework. The slicer can incrementally extend or reduce an existing slice if model elements shall be added or removed, even if the slice has been edited in the meantime. We demonstrate the usefulness of our slicer in several scenarios using a large UML model. A screencast of the demonstrated scenarios is provided at http://pi.informatik.uni-siegen.de/projects/SiLift/ase2017.
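The abstract describes a slicer that extends or reduces an editable slice incrementally. As a rough illustration of that idea, the sketch below treats a model as a graph of elements with reference edges and keeps a set of seed elements; extending or reducing the slice updates the seeds and recomputes the reference closure so the slice stays self-contained. The class and method names are invented for the example and are not the SiLift tool's API.

```python
# Sketch of incremental slice maintenance over a model represented as a
# graph of element ids with reference edges. Names are illustrative only.
class IncrementalSlicer:
    def __init__(self, model_edges):
        # model_edges: element id -> set of element ids it references
        self.model_edges = model_edges
        self.seeds = set()

    def _closure(self, seeds):
        # Follow references transitively so every element the slice refers
        # to is itself contained in the slice (keeps the slice editable).
        worklist, reached = list(seeds), set(seeds)
        while worklist:
            for target in self.model_edges.get(worklist.pop(), ()):
                if target not in reached:
                    reached.add(target)
                    worklist.append(target)
        return reached

    def extend(self, added):
        self.seeds |= set(added)
        return self._closure(self.seeds)

    def reduce(self, removed):
        self.seeds -= set(removed)
        return self._closure(self.seeds)

edges = {"Order": {"Customer"}, "Customer": {"Address"}, "Invoice": set()}
slicer = IncrementalSlicer(edges)
print(slicer.extend({"Order"}))   # {'Order', 'Customer', 'Address'}
print(slicer.reduce({"Order"}))   # set()
```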
Cited by: 13
Detecting fragile comments
Inderjot Kaur Ratol, M. Robillard
Refactoring is a common software development practice and many simple refactorings can be performed automatically by tools. Identifier renaming is a widely performed refactoring activity. With tool support, rename refactorings can rely on the program structure to ensure correctness of the code transformation. Unfortunately, the textual references to the renamed identifier present in the unstructured comment text cannot be formally detected through the syntax of the language, and are thus fragile with respect to identifier renaming. We designed a new rule-based approach to detect fragile comments. Our approach, called Fraco, takes into account the type of identifier, its morphology, the scope of the identifier and the location of comments. We evaluated the approach by comparing its precision and recall against hand-annotated benchmarks created for six target Java systems, and compared the results against the performance of Eclipse's automated in-comment identifier replacement feature. Fraco performed with near-optimal precision and recall on most components of our evaluation data set, and generally outperformed the baseline Eclipse feature. As part of our evaluation, we also noted that more than half of the total number of identifiers in our data set had fragile comments after renaming, which further motivates the need for research on automatic comment refactoring.
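As a toy illustration of what makes a comment fragile with respect to renaming, the snippet below flags comments that textually mention an identifier as a whole word. Fraco's actual rules additionally weigh the identifier's type, morphology, scope, and the comment's location; the function name and data format here are assumptions made for the example.

```python
import re

def find_fragile_comments(identifier, comments):
    """Return comments that mention `identifier` as a whole word and would
    therefore go stale if the identifier were renamed. A crude stand-in for
    Fraco's rule set."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    return [(line, text) for line, text in comments if pattern.search(text)]

comments = [
    (12, "// maxRetries bounds the reconnect loop"),
    (40, "// retries are capped elsewhere"),
]
print(find_fragile_comments("maxRetries", comments))
# -> [(12, '// maxRetries bounds the reconnect loop')]
```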
Cited by: 53
Synthetic data generation for statistical testing
Ghanem Soltana, M. Sabetzadeh, L. Briand
Usage-based statistical testing employs knowledge about the actual or anticipated usage profile of the system under test for estimating system reliability. For many systems, usage-based statistical testing involves generating synthetic test data. Such data must possess the same statistical characteristics as the actual data that the system will process during operation. Synthetic test data must further satisfy any logical validity constraints that the actual data is subject to. Targeting data-intensive systems, we propose an approach for generating synthetic test data that is both statistically representative and logically valid. The approach works by first generating a data sample that meets the desired statistical characteristics, without taking into account the logical constraints. Subsequently, the approach tweaks the generated sample to fix any logical constraint violations. The tweaking process is iterative and continuously guided toward achieving the desired statistical characteristics. We report on a realistic evaluation of the approach, where we generate a synthetic population of citizens' records for testing a public administration IT system. Results suggest that our approach is scalable and capable of simultaneously fulfilling the statistical representativeness and logical validity requirements.
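A minimal sketch of the generate-then-tweak idea described above: draw a sample that follows the target distribution, then repeatedly repair records that violate a logical constraint by redrawing them from the same distribution. The record schema, constraint, and distribution are invented for illustration and are not taken from the paper's case study.

```python
import random

def generate_valid_sample(n, sample_age, constraints, max_rounds=1000):
    # Step 1: draw a statistically representative sample, ignoring constraints.
    people = [{"age": sample_age(), "employed": random.random() < 0.6}
              for _ in range(n)]
    # Step 2: iteratively tweak only the violating records; redrawing them
    # from the same distribution keeps the statistics close to the target.
    for _ in range(max_rounds):
        violations = [p for p in people if not all(c(p) for c in constraints)]
        if not violations:
            break
        for p in violations:
            p["age"] = sample_age()
            p["employed"] = p["employed"] and p["age"] >= 16
    return people

# Example logical validity constraint: an employed person is at least 16.
constraints = [lambda p: not p["employed"] or p["age"] >= 16]
population = generate_valid_sample(1000, lambda: random.gauss(40, 15), constraints)
print(sum(p["employed"] for p in population), "employed of", len(population))
```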
Cited by: 34
Characterizing and taming non-deterministic bugs in Javascript applications
Jie Wang
JavaScript has become one of the most popular programming languages for both client-side and server-side applications. In JavaScript applications, events may be generated, triggered and consumed non-deterministically. Thus, JavaScript applications may suffer from non-deterministic bugs, when events are triggered and consumed in an unexpected order. In this proposal, we aim to characterize and combat non-deterministic bugs in JavaScript applications. Specifically, we first perform a comprehensive study about real-world non-deterministic bugs in server-side JavaScript applications. In order to facilitate bug diagnosis, we further propose approaches to isolate the necessary events that are responsible for the occurrence of a failure. We also plan to design new techniques in detecting non-deterministic bugs in JavaScript applications.
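The kind of event-ordering non-determinism described above can be illustrated outside JavaScript as well; the Python asyncio sketch below shows two callbacks completing in an unpredictable order, so any code that assumes a fixed order carries a latent bug. The callback names are arbitrary.

```python
import asyncio
import random

# Stand-in for the JavaScript situation: two asynchronous callbacks race,
# and the observed state depends on the order in which they complete.
log = []

async def callback(name):
    await asyncio.sleep(random.random() / 100)   # unpredictable completion time
    log.append(name)

async def main():
    await asyncio.gather(callback("read-config"), callback("start-server"))
    # A latent bug surfaces only in runs where "start-server" lands first.
    print(log)

asyncio.run(main())
```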
Cited by: 2
EHBDroid: Beyond GUI testing for Android applications
Wei Song, Xiangxing Qian, Jeff Huang
With the prevalence of Android-based mobile devices, automated testing for Android apps has received increasing attention. However, owing to the large variety of events that Android supports, test input generation is a challenging task. In this paper, we present a novel approach and an open source tool called EHBDroid for testing Android apps. In contrast to conventional GUI testing approaches, a key novelty of EHBDroid is that it does not generate events from the GUI, but directly invokes callbacks of event handlers. By doing so, EHBDroid can efficiently simulate a large number of events that are difficult to generate by traditional UI-based approaches. We have evaluated EHBDroid on a collection of 35 real-world large-scale Android apps and compared its performance with two state-of-the-art UI-based approaches, Monkey and Dynodroid. Our experimental results show that EHBDroid is significantly more effective and efficient than Monkey and Dynodroid: in a much shorter time, EHBDroid achieves as much as 22.3% higher statement coverage (11.1% on average) than the other two approaches, and found 12 bugs in these benchmarks, including 5 new bugs that the other two failed to find.
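The core idea, bypassing the GUI and invoking registered event-handler callbacks directly, can be sketched with a toy handler registry, as below. This is only an illustration of the principle; the classes here are invented and are neither the Android framework nor EHBDroid's instrumentation.

```python
# Toy contrast between GUI-driven event generation and direct callback
# invocation.
class Event:
    def __init__(self, source):
        self.source = source

class HandlerRegistry:
    def __init__(self):
        self.handlers = {}          # widget id -> registered callbacks

    def register(self, widget_id, callback):
        self.handlers.setdefault(widget_id, []).append(callback)

    def invoke_all(self):
        # EHBDroid-style step: skip the GUI entirely and call every
        # registered event handler directly with a synthetic event.
        for widget_id, callbacks in self.handlers.items():
            for cb in callbacks:
                cb(Event(widget_id))

registry = HandlerRegistry()
registry.register("btn_save", lambda e: print("save handler fired by", e.source))
registry.register("btn_sync", lambda e: print("sync handler fired by", e.source))
registry.invoke_all()
```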
Cited by: 63
Semantics-assisted code review: An efficient tool chain and a user study
M. Menarini, Yan Yan, W. Griswold
Code changes are often reviewed before they are deployed. Popular source control systems aid code review by presenting textual differences between old and new versions of the code, leaving developers with the difficult task of determining whether the differences actually produced the desired behavior. Fortunately, we can mine such information from code repositories. We propose aiding code review with inter-version semantic differential analysis. During review of a new commit, a developer is presented with summaries of both code differences and behavioral differences, which are expressed as diffs of likely invariants extracted by running the system's test cases. As a result, developers can more easily determine that the code changes produced the desired effect. We created an invariant-mining tool chain, Getty, to support our concept of semantically-assisted code review. To validate our approach, 1) we applied Getty to the commits of 6 popular open source projects, 2) we assessed the performance and cost of running Getty in different configurations, and 3) we performed a comparative user study with 18 developers. Our results demonstrate that semantically-assisted code review is feasible, effective, and that real programmers can leverage it to improve the quality of their reviews.
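To make the notion of a behavioral diff concrete, the sketch below mines deliberately simple "likely invariants" (value ranges and sign) from test observations of two versions and reports the symmetric difference to the reviewer. The miner is a toy stand-in for real likely-invariant inference (Daikon-style); the observation format is an assumption for the example.

```python
def mine_invariants(observations):
    # observations: method name -> observed return values from the test suite.
    invariants = set()
    for method, values in observations.items():
        invariants.add(f"{method}: return in [{min(values)}, {max(values)}]")
        if all(v >= 0 for v in values):
            invariants.add(f"{method}: return >= 0")
    return invariants

old_obs = {"balance": [0, 10, 250]}     # observations on the old version
new_obs = {"balance": [-5, 10, 250]}    # the commit now allows negative balances

old_inv, new_inv = mine_invariants(old_obs), mine_invariants(new_obs)
print("behavior dropped by the change:", old_inv - new_inv)
print("behavior added by the change:  ", new_inv - old_inv)
```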
Cited by: 9
Towards robust instruction-level trace alignment of binary code
Ulf Kargén, N. Shahmehri
Program trace alignment is the process of establishing a correspondence between dynamic instruction instances in executions of two semantically similar but syntactically different programs. In this paper we present what is, to the best of our knowledge, the first method capable of aligning realistically long execution traces of real programs. To maximize generality, our method works entirely on the machine code level, i.e. it does not require access to source code. Moreover, the method is based entirely on dynamic analysis, which avoids the many challenges associated with static analysis of binary code, and which additionally makes our approach inherently resilient to e.g. static code obfuscation. Therefore, we believe that our trace alignment method could prove to be a useful aid in many program analysis tasks, such as debugging, reverse-engineering, investigating plagiarism, and malware analysis. We empirically evaluate our method on 11 popular Linux programs, and show that it is capable of producing meaningful alignments in the presence of various code transformations such as optimization or obfuscation, and that it easily scales to traces with tens of millions of instructions.
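As a small, hedged illustration of what trace alignment means, the snippet below aligns two short instruction traces after abstracting each instruction to its mnemonic, using Python's difflib. A real tool must scale to tens of millions of instructions and use much richer matching features; the example traces are invented.

```python
import difflib

def abstract(trace):
    # Keep only the mnemonic so the alignment tolerates register renaming
    # and different addresses/offsets between the two binaries.
    return [insn.split()[0] for insn in trace]

trace_a = ["mov eax, 1", "add eax, ebx", "call 0x401000", "ret"]
trace_b = ["mov ecx, 1", "nop", "add ecx, edx", "call 0x8048100", "ret"]

matcher = difflib.SequenceMatcher(a=abstract(trace_a), b=abstract(trace_b))
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag == "equal":
        for i, j in zip(range(i1, i2), range(j1, j2)):
            print(f"{trace_a[i]:<20} <-> {trace_b[j]}")
```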
Cited by: 24
SymInfer: Inferring program invariants using symbolic states
Thanhvu Nguyen, Matthew B. Dwyer, W. Visser
We introduce a new technique for inferring program invariants that uses symbolic states generated by symbolic execution. Symbolic states, which consist of path conditions and constraints on local variables, are a compact description of sets of concrete program states and they can be used for both invariant inference and invariant verification. Our technique uses a counterexample-based algorithm that creates concrete states from symbolic states, infers candidate invariants from concrete states, and then verifies or refutes candidate invariants using symbolic states. The refutation case produces concrete counterexamples that prevent spurious results and allow the technique to obtain more precise invariants. This process stops when the algorithm reaches a stable set of invariants. We present SymInfer, a tool that implements these ideas to automatically generate invariants at arbitrary locations in a Java program. The tool obtains symbolic states from Symbolic PathFinder and uses existing algorithms to infer complex (potentially nonlinear) numerical invariants. Our preliminary results show that SymInfer is effective in using symbolic states to generate precise and useful invariants for proving program safety and analyzing program runtime complexity. We also show that SymInfer outperforms existing invariant generation systems.
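The counterexample-guided loop sketched in the abstract can be illustrated with a toy: infer range-style candidate invariants from concrete states, try to refute each candidate against the set of reachable states (standing in for queries on symbolic states or a solver), feed any counterexamples back, and stop when the candidate set is stable. All names and the brute-force checker are assumptions for the example, not SymInfer's implementation.

```python
def infer_candidates(states):
    # Guess simple range invariants over x from the concrete states seen so far.
    xs = [s["x"] for s in states]
    return {f"x >= {min(xs)}", f"x <= {max(xs)}"}

def refute(candidate, reachable_states):
    # Stand-in for a symbolic-state / solver query: return a counterexample
    # state that violates the candidate invariant, or None if none exists.
    for s in reachable_states:
        if not eval(candidate, {}, {"x": s["x"]}):
            return s
    return None

reachable = [{"x": v} for v in (0, 1, 3, 7, 9)]   # pretend: from symbolic execution
observed = reachable[:2]                           # initial concrete sample

while True:
    candidates = infer_candidates(observed)
    counterexamples = [cex for c in candidates if (cex := refute(c, reachable))]
    if not counterexamples:
        break                                      # candidate set is stable
    observed.extend(counterexamples)               # refine with the refutations

print(sorted(candidates))                          # e.g. ['x <= 9', 'x >= 0']
```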
Cited by: 28
Why and how JavaScript developers use linters
Kristín Fjóla Tómasdóttir, M. Aniche, A. Deursen
Automatic static analysis tools help developers to automatically spot code issues in their software. They can be of extreme value in languages with dynamic characteristics, such as JavaScript, where developers can easily introduce mistakes which can go unnoticed for a long time, e.g. a simple syntactic or spelling mistake. Although research has already shown how developers perceive such tools for strongly-typed languages such as Java, little is known about their perceptions when it comes to dynamic languages. In this paper, we investigate what motivates and how developers make use of such tools in JavaScript projects. To that goal, we apply a qualitative research method to conduct and analyze a series of 15 interviews with developers responsible for the linter configuration in reputable OSS JavaScript projects that apply the most commonly used linter, ESLint. The results describe the benefits that developers obtain when using ESLint, the different ways one can configure the tool and prioritize its rules, and the existing challenges in applying linters in the real world. These results have direct implications for developers, tool makers, and researchers, such as tool improvements, and a research agenda that aims to increase our knowledge about the usefulness of such analyzers.
Cited by: 41
Improved query reformulation for concept location using CodeRank and document structures
Pub Date: 2017-10-30, DOI: 10.7287/peerj.preprints.3186v2
M. M. Rahman, C. Roy
During software maintenance, developers usually deal with a significant number of software change requests. As a part of this, they often formulate an initial query from the request texts, and then attempt to map the concepts discussed in the request to relevant source code locations in the software system (a.k.a. concept location). Unfortunately, studies suggest that they often perform poorly in choosing the right search terms for a change task. In this paper, we propose a novel technique, ACER, that takes an initial query, identifies appropriate search terms from the source code using a novel term weight, CodeRank, and then suggests an effective reformulation of the initial query by exploiting the source document structures, query quality analysis and machine learning. Experiments with 1,675 baseline queries from eight subject systems report that our technique can improve 71% of the baseline queries, which is highly promising. Comparison with five closely related existing techniques in query reformulation not only validates our empirical findings but also demonstrates the superiority of our technique.
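To give a flavor of graph-based term weighting in the spirit of CodeRank, the sketch below builds a co-occurrence graph over source-code terms, runs a PageRank-style iteration, and suggests the top-ranked terms as expansions of an initial query. The damping factor, iteration count, and scoring formula are conventional PageRank defaults, not ACER's actual definition of CodeRank, and the example documents are invented.

```python
from collections import defaultdict
from itertools import combinations

def code_rank(documents, d=0.85, iters=30):
    # Build an undirected co-occurrence graph over the terms of each document.
    graph = defaultdict(set)
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a].add(b)
            graph[b].add(a)
    # PageRank-style iteration: a term is important if important terms
    # co-occur with it.
    rank = {t: 1.0 for t in graph}
    for _ in range(iters):
        rank = {t: (1 - d) + d * sum(rank[n] / len(graph[n]) for n in graph[t])
                for t in graph}
    return rank

docs = [                             # terms extracted from candidate source files
    ["connection", "pool", "timeout", "retry"],
    ["connection", "socket", "timeout"],
    ["pool", "size", "config"],
]
ranks = code_rank(docs)
initial_query = ["timeout"]
expansion = [t for t, _ in sorted(ranks.items(), key=lambda kv: -kv[1])
             if t not in initial_query][:3]
print("reformulated query:", initial_query + expansion)
```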
Cited by: 22