2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE): Latest Papers

Synthesising Interprocedural Bit-Precise Termination Proofs (T)
Hong-Yi Chen, C. David, D. Kroening, P. Schrammel, Björn Wachter
Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, which is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.
DOI: 10.1109/ASE.2015.10 | https://doi.org/10.1109/ASE.2015.10 | Pages: 53-64 | Published: 2015-11-09
Citations: 21
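To make the terms "bit-precise", "ranking function", and "precondition for termination" from the abstract above concrete, here is a minimal sketch. It is not the 2LS algorithm: instead of template-based synthesis it simply enumerates every state of a hypothetical 8-bit loop, `while (x != 0) x -= 2;`, which is an assumption chosen for illustration. With wraparound arithmetic the loop terminates only for even inputs, and the linear function r(x) = x is a valid ranking function only on that set.

```python
# Illustrative brute-force check on a hypothetical 8-bit loop:
#   unsigned char x; while (x != 0) x -= 2;
WIDTH = 8
MASK = (1 << WIDTH) - 1


def step(x: int) -> int:
    """One loop iteration with wraparound (bit-precise) subtraction."""
    return (x - 2) & MASK


def terminates(x0: int) -> bool:
    """Run the loop from x0; revisiting a state means it never exits."""
    seen = set()
    x = x0
    while x != 0:
        if x in seen:
            return False
        seen.add(x)
        x = step(x)
    return True


# The exact set of terminating inputs; an analysis would infer a subset of it
# (an under-approximating precondition).
precondition = {x for x in range(MASK + 1) if terminates(x)}


def rank(x: int) -> int:
    """Candidate linear ranking function r(x) = x."""
    return x


def rank_decreases_on(states) -> bool:
    """True if r is bounded below and strictly decreases on every loop step."""
    return all(
        rank(x) >= 0 and rank(step(x)) < rank(x)
        for x in states
        if x != 0  # the loop guard
    )
```

On this toy loop, `rank_decreases_on(precondition)` holds while `rank_decreases_on(range(256))` does not, because an odd value such as 1 wraps around to 255. An analysis over unbounded integers would miss that wraparound, which is the point of reasoning bit-precisely.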
Interpolation Guided Compositional Verification (T)
Shang-Wei Lin, Jun Sun, Truong Khanh Nguyen, Yang Liu, J. Dong
Model checking suffers from the state space explosion problem. Compositional verification techniques such as assume-guarantee reasoning (AGR) have been proposed to alleviate the problem. However, there are at least three challenges in applying AGR. Firstly, given a system M1 ∥ M2, how do we automatically construct and refine (in the presence of spurious counterexamples) an assumption A2, which must be an abstraction of M2? Previous approaches suggest incrementally learning and modifying the assumption through multiple invocations of a model checker, which can often be time-consuming. Secondly, how do we keep the state space small when checking M1 ∥ A2 ⊨ φ if multiple refinements of A2 are necessary? Lastly, in the presence of multiple parallel components, how do we partition the components? In this work, we propose interpolation-guided compositional verification. The idea is to tackle these three challenges by using interpolations to generate and refine the abstraction of M2, to abstract M1 at the same time (so that the state space is reduced even if A2 is refined all the way to M2), and to find good partitions. Experimental results show that the proposed approach consistently outperforms existing approaches.
DOI: 10.1109/ASE.2015.33 | https://doi.org/10.1109/ASE.2015.33 | Pages: 65-74 | Published: 2015-11-09
Citations: 12
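For orientation, the assume-guarantee rule underlying the questions in the abstract above can be written as follows. The notation is assumed for illustration and may differ from the paper's: \parallel denotes parallel composition, \preceq denotes "is abstracted by" (for example, trace inclusion), \models denotes satisfaction, and \varphi is the property being checked. The rule is sound only when the abstraction relation is preserved by parallel composition and by the class of properties considered.

```latex
\[
\frac{M_1 \parallel A_2 \models \varphi \qquad M_2 \preceq A_2}
     {M_1 \parallel M_2 \models \varphi}
\]
```

Read bottom-up, the rule replaces the check on the full system with a check on M1 composed with a smaller assumption A2, which is why the size and refinement of A2 matter so much.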
Do Automatically Generated Unit Tests Find Real Faults? An Empirical Study of Effectiveness and Challenges (T)
S. Shamshiri, René Just, J. Rojas, G. Fraser, Phil McMinn, Andrea Arcuri
Rather than tediously writing unit tests manually, tools can be used to generate them automatically - sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.
DOI: 10.1109/ASE.2015.86 | https://doi.org/10.1109/ASE.2015.86 | Pages: 201-211 | Published: 2015-11-09
Citations: 192
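The two percentages in the abstract above measure different things: the share of faults found by at least one generated suite, and the share of individual suites that found their fault. The sketch below shows how both can be computed from the same result table. The function name, the data layout, and the toy data are all hypothetical and are not taken from the study.

```python
def detection_rates(results):
    """Compute (fault-level rate, suite-level rate).

    `results` maps a fault id to a list of booleans, one per generated
    test suite, where True means that suite detected the fault.
    """
    faults_found = sum(1 for suites in results.values() if any(suites))
    all_suites = [hit for suites in results.values() for hit in suites]
    fault_rate = faults_found / len(results)
    suite_rate = sum(all_suites) / len(all_suites)
    return fault_rate, suite_rate


# Made-up toy data: 4 faults, 4 generated suites per fault.
toy = {
    "fault-A": [True, False, False, False],
    "fault-B": [False, False, False, False],
    "fault-C": [False, True, False, False],
    "fault-D": [False, False, False, False],
}
fault_rate, suite_rate = detection_rates(toy)
```

In the toy data half of the faults are detected at least once, yet only one suite in eight detects anything, which mirrors the kind of gap the abstract reports between the two measures.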
Predicting Delays in Software Projects Using Networked Classification (T)
Morakot Choetkiertikul, K. Dam, T. Tran, A. Ghose
Software projects have a high risk of cost and schedule overruns, which has been a source of concern for the software engineering community for a long time. One of the challenges in software project management is to make reliable predictions of delays in the context of the constant and rapid changes inherent in software projects. This paper presents a novel approach to providing automated support for project managers and other decision makers in predicting whether a subset of software tasks (among the hundreds to thousands of ongoing tasks) in a software project have a risk of being delayed. Our approach makes use of not only features specific to individual software tasks (i.e. local data), as done in previous work, but also their relationships (i.e. networked data). In addition, using collective classification, our approach can simultaneously predict the degree of delay for a group of related tasks. Our evaluation results show a significant improvement over traditional approaches which perform classification on each task independently: achieving 46% to 97% precision (49% improved), 46% to 97% recall (28% improved), 56% to 75% F-measure (39% improved), and 78% to 95% Area Under the ROC Curve (16% improved).
DOI: 10.1109/ASE.2015.55 | https://doi.org/10.1109/ASE.2015.55 | Pages: 353-364 | Published: 2015-11-09
Citations: 43
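The evaluation measures named in the abstract above (precision, recall, F-measure) follow standard definitions. The sketch below computes them for a simplified binary "delayed / not delayed" setting, which is an assumption made here for brevity; the paper predicts a degree of delay, and the function name and sample labels are hypothetical.

```python
def precision_recall_f1(predicted, actual):
    """Return (precision, recall, F1) for parallel lists of booleans,
    where True means 'task is delayed'."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # correctly flagged as delayed
    fp = sum(1 for p, a in pairs if p and not a)    # flagged, but on time
    fn = sum(1 for p, a in pairs if not p and a)    # missed delays
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels for five tasks.
predicted = [True, True, False, True, False]
actual = [True, False, False, True, True]
precision, recall, f1 = precision_recall_f1(predicted, actual)
```

Here two of the three flagged tasks are truly delayed and two of the three delayed tasks are flagged, so all three measures come out at two thirds.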
Region and Effect Inference for Safe Parallelism (T)
Alexandros Tzannes, Stephen Heumann, Lamyaa Eloussi, Mohsen Vakilian, Vikram S. Adve, Michael Han
In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the strongest safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and because DPJ is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference, but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism, and they can be recorded both for documentation and for fast and modular safety checking.
DOI: 10.1109/ASE.2015.59 | https://doi.org/10.1109/ASE.2015.59 | Pages: 512-523 | Published: 2015-11-09
Citations: 2
Experiences from Designing and Validating a Software Modernization Transformation (E)
Alexandru F. Iosif-Lazar, Ahmad Salim Al-Sibahi, Aleksandar S. Dimovski, J. Savolainen, K. Sierszecki, A. Wąsowski
Software modernization often involves complex code transformations that convert legacy code to new architectures or platforms, while preserving the semantics of the original programs. We present the lessons learnt from an industrial software modernization project of considerable size. This includes collecting requirements for a code-to-model transformation, designing and implementing the transformation algorithm, and then validating correctness of this transformation for the code-base at hand. Our transformation is implemented in the TXL rewriting language and assumes specifically structured C++ code as input, which it translates to a declarative configuration model. The correctness criterion for the transformation is that the produced model admits the same configurations as the input code. The transformation converts C++ functions specifying around a thousand configuration parameters. We verify the correctness for each run individually, using translation validation and symbolic execution. The technique is formally specified and is applicable automatically for most of the code-base.
DOI: 10.1109/ASE.2015.84 | https://doi.org/10.1109/ASE.2015.84 | Pages: 597-607 | Published: 2015-11-09
Citations: 19
Copy and Paste Redeemed (T)
Krishna Narasimhan, Christoph Reichenbach
Modern software development relies on code reuse, which software engineers typically realise through handwritten abstractions, such as functions, methods, or classes. However, such abstractions can be challenging to develop and maintain. One alternative form of re-use is copy-paste-modify, a methodology in which developers explicitly duplicate source code to adapt the duplicate for a new purpose. We observe that copy-paste-modify can be substantially faster to use than manual abstraction, and past research strongly suggests that it is a popular technique among software developers. We therefore propose that software engineers should forego hand-written abstractions in favour of copying and pasting. However, empirical evidence also shows that copy-paste-modify complicates software maintenance and increases the frequency of bugs. To address this concern, we propose a software tool that merges together similar pieces of code and automatically creates suitable abstractions. This allows software developers to get the best of both worlds: custom abstraction together with easy re-use. To demonstrate the feasibility of our approach, we have implemented and evaluated a prototype merging tool for C++ on a number of near-miss clones (clones with some modifications) in popular Open Source packages. We found that maintainers find our algorithmically created abstractions to be largely preferable to existing duplicated code.
DOI: 10.1109/ASE.2015.39 | https://doi.org/10.1109/ASE.2015.39 | Pages: 630-640 | Published: 2015-11-09
Citations: 16
CLAMI: Defect Prediction on Unlabeled Datasets (T)
Jaechang Nam, Sunghun Kim
Defect prediction on new projects or projects with limited historical data is an interesting problem in software engineering. This is largely because it is difficult to collect defect information to label a dataset for training a prediction model. Cross-project defect prediction (CPDP) has tried to address this problem by reusing prediction models built by other projects that have enough historical data. However, CPDP does not always build a strong prediction model because of the different distributions among datasets. Approaches for defect prediction on unlabeled datasets have also tried to address the problem by adopting unsupervised learning, but they have one major limitation: the need for manual effort. In this study, we propose novel approaches, CLA and CLAMI, that show the potential for defect prediction on unlabeled datasets in an automated manner without the need for manual effort. The key idea of the CLA and CLAMI approaches is to label an unlabeled dataset by using the magnitude of metric values. In our empirical study on seven open-source projects, the CLAMI approach achieved promising prediction performance, with an average F-measure of 0.636 and an average AUC of 0.723, comparable to that of defect prediction based on supervised learning.
DOI: 10.1109/ASE.2015.56 | https://doi.org/10.1109/ASE.2015.56 | Pages: 452-463 | Published: 2015-11-09
Citations: 124
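The key idea stated in the abstract above, labeling instances by the magnitude of their metric values, can be sketched as follows. This is one plausible reading and not the authors' implementation: the use of the per-metric median as the threshold, the rule that the upper half of the clusters is labeled buggy, the function name `cla_label`, and the sample data are all assumptions made for illustration. The sketch covers only the labeling step.

```python
from statistics import median


def cla_label(instances):
    """Label each instance True (predicted buggy) or False (predicted clean).

    `instances` is a list of equal-length metric vectors, e.g. one row of
    code metrics per source file.
    """
    n_metrics = len(instances[0])
    # Assumed threshold: the median of each metric over the whole dataset.
    medians = [median(row[j] for row in instances) for j in range(n_metrics)]
    # Score K = how many of an instance's metrics exceed their median.
    scores = [
        sum(1 for j in range(n_metrics) if row[j] > medians[j])
        for row in instances
    ]
    # Instances sharing a score form a cluster; the upper half of the
    # distinct scores is treated as buggy (assumed rule).
    distinct = sorted(set(scores))
    cutoff = distinct[len(distinct) // 2]
    return [k >= cutoff for k in scores]


# Made-up metric vectors for five hypothetical files.
rows = [
    [1, 10, 0],
    [2, 20, 1],
    [9, 90, 5],
    [8, 80, 4],
    [3, 30, 1],
]
labels = cla_label(rows)
```

In the toy data the two rows with uniformly large metric values are labeled buggy and the rest clean. Note that with a single cluster this simple cutoff labels everything buggy, so a real implementation would need to treat that case separately.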
Scaling Size and Parameter Spaces in Variability-Aware Software Performance Models (T)
M. Kowal, Max Tschaikowski, M. Tribastone, Ina Schaefer
In software performance engineering, what-if scenarios, architecture optimization, capacity planning, run-time adaptation, and uncertainty management of realistic models typically require the evaluation of many instances. Effective analysis is, however, hindered by two orthogonal sources of complexity. The first is the infamous problem of state space explosion: the analysis of a single model becomes intractable as its size grows. The second is the massive parameter space to be explored, in which computations cannot be reused across model instances. In this paper, we efficiently analyze many queuing models with the distinctive feature of more accurately capturing variability and uncertainty of execution rates by incorporating general (i.e., non-exponential) distributions. Applying product-line engineering methods, we consider a family of models generated by a core that evolves into concrete instances by applying simple delta operations affecting both the topology and the model's parameters. State explosion is tackled by turning to a scalable approximation based on ordinary differential equations. The entire model space is analyzed in a family-based fashion, i.e., at once using an efficient symbolic solution of a super-model that subsumes every concrete instance. Extensive numerical tests show that this is orders of magnitude faster than a naive instance-by-instance analysis.
DOI: https://doi.org/10.1109/ASE.2015.16 · pp. 407-417 · Published 2015-11-09
Citations: 28
Automatically Generating Test Templates from Test Names (N)
Benwen Zhang, Emily Hill, J. Clause
Existing specification-based testing techniques require specifications that either do not exist or are too difficult to create. As a result, they often fall short of their goal of helping developers test expected behaviors. In this paper we present a novel, natural language-based approach that exploits the descriptive nature of test names to generate test templates. Similar to how modern IDEs simplify development by providing templates for common constructs such as loops, test templates can save time and lower the cognitive barrier for writing tests. The results of our evaluation show that the approach is feasible: despite the difficulty of the task, when test names contain a sufficient amount of information, the approach's accuracy is over 80% when parsing the relevant information from the test name and generating the template.
DOI: https://doi.org/10.1109/ASE.2015.68 · pp. 506-511 · Published 2015-11-09
Citations: 25