
2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation: Latest Publications

Impact of Code Refactoring Using Object-Oriented Methodology on a Scientific Computing Application
Malin Källén, S. Holmgren, E. Hvannberg
Methods and tools for refactoring of software have been extensively studied during the last decades, and we argue that there is now a need for additional studies of the effects of refactoring on code quality and external code attributes such as computational performance. To study these effects, we have refactored the central parts of a code base developed in academia for a class of computationally demanding scientific computing problems. We made design choices on the basis of the SOLID principles and we used object-oriented techniques, such as the Gang of Four patterns, in the implementation. In this paper, we discuss the effect on maintainability qualitatively and also analyze it quantitatively using a set of software metrics extending the Chidamber-Kemerer suite. Not surprisingly, we find that maintainability has increased as an effect of the refactoring. We also study performance and find that dynamic binding in the most frequently executed parts of the code, which inhibits inlining by the compiler, increases execution times by over 700%. By exploiting static polymorphism, we have been able to reduce the relative increase in execution times to less than 100%. We argue that the code version implementing static polymorphism is less maintainable than the one using dynamic polymorphism, although both versions are considerably more maintainable than the original code.
{"title":"Impact of Code Refactoring Using Object-Oriented Methodology on a Scientific Computing Application","authors":"Malin Källén, S. Holmgren, E. Hvannberg","doi":"10.1109/SCAM.2014.21","DOIUrl":"https://doi.org/10.1109/SCAM.2014.21","url":null,"abstract":"Methods and tools for refactoring of software have been extensively studied during the last decades, and we argue that there is now a need for additional studies of the effects of refactoring on code quality and external code attributes such as computational performance. To study these effects, we have refactored the central parts of a code base developed in academia for a class of computationally demanding scientific computing problems. We made design choices on the basis of the SOLID principles and we used object-oriented techniques, such as the Gang of Four patterns, in the implementation. In this paper, we discuss the effect on maintainability qualitatively and also analyze it quantitatively using a set of software metrics extending the Chidamber-Kemerer suite. Not surprisingly, we find that maintainability has increased as an effect of the refactoring. We also study performance and find that dynamic binding, which inhibits in lining by the compiler, in the most frequently executed parts of the code makes the execution times increase by over 700%. By exploiting static polymorphism, we have been able able to reduce the relative increase in execution times to less than 100%. 
We argue that the code version implementing static polymorphism is less maintainable than the one using dynamic polymorphism, although both versions are considerably more maintainable than the original code.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129481516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Pangea: A Workbench for Statically Analyzing Multi-language Software Corpora
A. Caracciolo, Andrei Chis, B. Spasojevic, M. Lungu
Software corpora facilitate reproducibility of analyses; however, static analysis of an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for a single language, which increases the effort of cross-language analysis. To address these issues we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object-model snapshots that can be loaded directly into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the-box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.
{"title":"Pangea: A Workbench for Statically Analyzing Multi-language Software Corpora","authors":"A. Caracciolo, Andrei Chis, B. Spasojevic, M. Lungu","doi":"10.1109/SCAM.2014.39","DOIUrl":"https://doi.org/10.1109/SCAM.2014.39","url":null,"abstract":"Software corpora facilitate reproducibility of analyses, however, static analysis for an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for single languages increasing the effort for cross-language analysis. To address these aspects we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queryed without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115523530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
SourceMeter SonarQube Plug-in
R. Ferenc, Laszlo Lango, István Siket, T. Gyimóthy, Tibor Bakota
The SourceMeter SonarQube plug-in is an extension of SonarQube, an open-source platform for managing code quality made by SonarSource S.A., Switzerland. The plug-in extends the built-in Java code analysis engine of SonarQube with FrontEndART's high-end Java code analysis engine. Most of SonarQube's original analysis results are replaced (including the detected source code duplications), while the range of available analyses is extended with a number of additional metrics and issue detectors. Additionally, the plug-in offers new GUI features on the SonarQube dashboard and drill-down views, making the SonarQube user experience more comfortable and work with the tool more productive.
{"title":"Source Meter Sonar Qube Plug-in","authors":"R. Ferenc, Laszlo Lango, István Siket, T. Gyimóthy, Tibor Bakota","doi":"10.1109/SCAM.2014.31","DOIUrl":"https://doi.org/10.1109/SCAM.2014.31","url":null,"abstract":"The SourceMeter Sonar Qube plug-in is an extension of Sonar Qube, an open-source platform for managing code quality made by Sonar Source S.A, Switzerland. The plug-in extends the built-in Java code analysis engine of Sonar Qube with Front End ART's high-end Java code analysis engine. Most of Sonar Qubes original analysis results are replaced (including the detected source code duplications), while the range of available analyses is extended with a number of additional metrics and issue detectors. Additionally, the plug-in offers new GUI features on the Sonar Qube dashboard and drilldown views, making the Sonar Qube user experience more comfortable and the work with the tool more productive.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125902180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
On the Accuracy of Forward Dynamic Slicing and Its Effects on Software Maintenance
Siyuan Jiang, Raúl A. Santelices, M. Grechanik, Haipeng Cai
Dynamic slicing is a practical and popular analysis technique used in various software-engineering tasks. Dynamic slicing is known to be incomplete because it analyzes only a subset of all possible executions of a program. However, it is less well known that its results may inaccurately represent the dependencies that occur in those executions. Some researchers have identified this problem and developed extensions such as relevant slicing, which incorporates static information. Yet dynamic slicing continues to be widely used, even though the extent of its inaccuracy is not well understood, which can affect the benefits of this analysis. In this paper, we present an approach to assess the accuracy of forward dynamic slices, which are used in software maintenance and evolution tasks. Because finding all actual dependencies is an undecidable problem, our approach instead computes bounds on the precision and recall of forward dynamic slices. Our approach uses sensitivity analysis and execution differencing to find a subset of all program statements that truly depend at runtime on another statement. Using this approach, we studied the accuracy of many forward dynamic slices from a variety of Java applications. Our results show that forward dynamic slicing can have low recall -- for dependencies in the analyzed executions -- and some potential imprecision. We also conducted a case study that shows how this inaccuracy affects a software maintenance task. To the best of our knowledge, ours is the first work that quantifies the intrinsic limitations of dynamic slicing.
{"title":"On the Accuracy of Forward Dynamic Slicing and Its Effects on Software Maintenance","authors":"Siyuan Jiang, Raúl A. Santelices, M. Grechanik, Haipeng Cai","doi":"10.1109/SCAM.2014.23","DOIUrl":"https://doi.org/10.1109/SCAM.2014.23","url":null,"abstract":"Dynamic slicing is a practical and popular analysis technique used in various software-engineering tasks. Dynamic slicing is known to be incomplete because it analyzes only a subset of all possible executions of a program. However, it is less known that its results may inaccurately represent the dependencies that occur in those executions. Some researchers have identified this problem and developed extensions such as relevant slicing, which incorporates static information. Yet, dynamic slicing continues to be widely used, even though the extent of its inaccuracy is not well understood, which can affect the benefits of this analysis. In this paper, we present an approach to assess the accuracy of forward dynamic slices, which are used in software maintenance and evolution tasks. Because finding all actual dependencies is an undecidable problem, our approach instead computes bounds of the precision and recall of forward dynamic slices. Our approach uses sensitivity analysis and execution differencing to find a subset of all program statements that truly depend at runtime on another statement. Using this approach, we studied the accuracy of many forward dynamic slices from a variety of Java applications. Our results show that forward dynamic slicing can have low recall -- for dependencies in the analyzed executions -- and some potential imprecision. We also conducted a case study that shows how this inaccuracy affects a software maintenance task. 
To the best of our knowledge, ours is the first work that quantifies the intrinsic limitations of dynamic slicing.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128164497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Semantic Versioning versus Breaking Changes: A Study of the Maven Repository
S. Raemaekers, A. Deursen, Joost Visser
For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In this study, we investigate semantic versioning, a versioning scheme which provides strict rules on major versus minor and patch releases. We analyze seven years of library release history in Maven Central, and contrast version identifiers with actual incompatibilities. We find that around one third of all releases introduce at least one breaking change, and that this figure is the same for minor and major releases, indicating that version numbers do not provide developers with information on the stability of interfaces. Additionally, we find that adherence to semantic versioning principles has only marginally increased over time. We also investigate the use of deprecation tags and find that methods get deleted without ever being tagged as deprecated, while methods with deprecation tags are never deleted. We conclude the paper by arguing that adherence to semantic versioning principles should increase because it provides users of an interface with a way to determine the amount of rework to expect when upgrading to a new version.
{"title":"Semantic Versioning versus Breaking Changes: A Study of the Maven Repository","authors":"S. Raemaekers, A. Deursen, Joost Visser","doi":"10.1109/SCAM.2014.30","DOIUrl":"https://doi.org/10.1109/SCAM.2014.30","url":null,"abstract":"For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In this study, we investigate semantic versioning, a versioning scheme which provides strict rules on major versus minor and patch releases. We analyze seven years of library release history in Maven Central, and contrast version identifiers with actual incompatibilities. We find that around one third of all releases introduce at least one breaking change, and that this figure is the same for minor and major releases, indicating that version numbers do not provide developers with information in stability of interfaces. Additionally, we find that the adherence to semantic versioning principles has only marginally increased over time. We also investigate the use of deprecation tags and find out that methods get deleted without applying deprecated tags, and methods with deprecated tags are never deleted. 
We conclude the paper by arguing that the adherence to semantic versioning principles should increase because it provides users of an interface with a way to determine the amount of rework that is expected when upgrading to a new version.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129371806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 119
Bulk Fixing Coding Issues and Its Effects on Software Quality: Is It Worth Refactoring?
Gábor Szoke, Gábor Antal, Csaba Nagy, R. Ferenc, T. Gyimóthy
The quality of a software system is mostly defined by its source code. Software evolves continuously: it gets modified and enhanced, and new requirements keep arising. If we do not periodically spend time on improving our source code, it becomes messy and its quality inevitably decreases. The literature tells us that we can improve the quality of a software product by regularly refactoring it. But does refactoring really increase software quality? Can a refactoring decrease quality? Is it possible to recognize the change in quality caused by a single refactoring operation? In this paper, we seek answers to these questions in a case study of refactoring large-scale proprietary software systems. We analyzed the source code of 5 systems, and measured the quality of several revisions over a period of time. We analyzed 2 million lines of code and identified nearly 200 refactoring commits which fixed over 500 coding issues. We found that a single refactoring makes only a small change (sometimes even decreasing quality), but when refactorings are done in blocks, they can significantly increase quality, resulting not only in local but also in global improvement of the code.
{"title":"Bulk Fixing Coding Issues and Its Effects on Software Quality: Is It Worth Refactoring?","authors":"Gábor Szoke, Gábor Antal, Csaba Nagy, R. Ferenc, T. Gyimóthy","doi":"10.1109/SCAM.2014.18","DOIUrl":"https://doi.org/10.1109/SCAM.2014.18","url":null,"abstract":"The quality of a software system is mostly defined by its source code. Software evolves continuously, it gets modified, enhanced, and new requirements always arise. If we do not spend time periodically on improving our source code, it becomes messy and its quality will decrease inevitably. Literature tells us that we can improve the quality of our software product by regularly refactoring it. But does refactoring really increase software quality? Can it happen that a refactoring decreases the quality? Is it possible to recognize the change in quality caused by a single refactoring operation? In our paper, we seek answers to these questions in a case study of refactoring large-scale proprietary software systems. We analyzed the source code of 5 systems, and measured the quality of several revisions for a period of time. We analyzed 2 million lines of code and identified nearly 200 refactoring commits which fixed over 500 coding issues. 
We found that one single refactoring only makes a small change (sometimes even decreases quality), but when we do them in blocks, we can significantly increase quality, which can result not only in the local, but also in the global improvement of the code.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132769442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 41
Supplementary Bug Fixes vs. Re-opened Bugs
Le An, Foutse Khomh, Bram Adams
A typical bug fixing cycle involves the reporting of a bug, the triaging of the report, the production and verification of a fix, and the closing of the bug. However, previous work has studied two phenomena where more than one fix is associated with the same bug report. The first is the case where developers re-open a previously fixed bug in the bug repository (sometimes even multiple times) to provide a new bug fix that replaces a previous one, whereas the second is the case where multiple commits in the version control system contribute to the same bug report ("supplementary bug fixes"). Even though both phenomena seem related, they have never been studied together, i.e., are supplementary fixes a subset of re-opened bugs or the other way around? This paper investigates the interplay between both phenomena in five open source software projects: Mozilla, NetBeans, Eclipse JDT Core, Eclipse Platform SWT, and WebKit. We found that re-opened bugs account for between 21.6% and 33.8% of all supplementary fixes. However, 33% to 57.5% of re-opened bugs had only one associated commit, which means that the original bug report was prematurely closed rather than fixed incorrectly. Furthermore, we constructed predictive models for re-opened bugs using historical information about supplementary bug fixes, achieving a precision between 72.2% and 97% and a recall between 47.7% and 65.3%. Software researchers and practitioners who are mining data repositories can use our approach to identify potential failures of their bug fixes and the re-opening of bug reports.
{"title":"Supplementary Bug Fixes vs. Re-opened Bugs","authors":"Le An, Foutse Khomh, Bram Adams","doi":"10.1109/SCAM.2014.29","DOIUrl":"https://doi.org/10.1109/SCAM.2014.29","url":null,"abstract":"A typical bug fixing cycle involves the reporting of a bug, the triaging of the report, the production and verification of a fix, and the closing of the bug. However, previous work has studied two phenomena where more than one fix are associated with the same bug report. The first one is the case where developers re-open a previously fixed bug in the bug repository (sometimes even multiple times) to provide a new bug fix that replace a previous fix, whereas the second one is the case where multiple commits in the version control system contribute to the same bug report (\"supplementary bug fixes\"). Even though both phenomena seem related, they have never been studied together, i.e., are supplementary fixes a subset of re-opened bugs or the other way around? This paper investigates the interplay between both phenomena in five open source software projects: Mozilla, Net beans, Eclipse JDT Core, Eclipse Platform SWT, and Web Kit. We found that re-opened bugs account for between 21.6% and 33.8% of all supplementary fixes. However, 33% to 57.5% of re-opened bugs had only one commit associated, which means that the original bug report was prematurely closed instead of fixed incorrectly. Furthermore, we constructed predictive models for re-opened bugs using historical information about supplementary bug fixes with a precision between 72.2% and 97%, as well as a recall between 47.7% and 65.3%. 
Software researchers and practitioners who are mining data repositories can use our approach to identify potential failures of their bug fixes and the re-opening of bug reports.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"1096 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133588153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
On the Use of Context in Recommending Exception Handling Code Examples
M. M. Rahman, C. Roy
Studies show that software developers often either misuse exception handling features or use them inefficiently, and such a practice may turn an ongoing software project into a fragile, insecure and non-robust application system. In this paper, we propose a context-aware code recommendation approach that recommends exception handling code examples from a number of popular open source code repositories hosted at GitHub. It collects the code examples using the GitHub code search API, and then analyzes, filters and ranks them against the code under development in the IDE by leveraging not only structural (i.e., graph-based) and lexical features but also heuristic quality measures of the exception handlers in the examples. Experiments with 4,400 code examples and 65 exception handling scenarios, as well as comparisons with four existing approaches, show that the proposed approach is highly promising.
{"title":"On the Use of Context in Recommending Exception Handling Code Examples","authors":"M. M. Rahman, C. Roy","doi":"10.1109/SCAM.2014.15","DOIUrl":"https://doi.org/10.1109/SCAM.2014.15","url":null,"abstract":"Studies show that software developers often either misuse exception handling features or use them inefficiently, and such a practice may lead an undergoing software project to a fragile, insecure and non-robust application system. In this paper, we propose a context-aware code recommendation approach that recommends exception handling code examples from a number of popular open source code repositories hosted at GitHub. It collects the code examples exploiting GitHub code search API, and then analyzes, filters and ranks them against the code under development in the IDE by leveraging not only the structural (i.e., graph-based) and lexical features but also the heuristic quality measures of exception handlers in the examples. Experiments with 4,400 code examples and 65 exception handling scenarios as well as comparisons with four existing approaches show that the proposed approach is highly promising.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114444304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 40
Concolic Fault Abstraction
Chanseok Oh, Martin Schäf, Daniel Schwartz-Narbonne, Thomas Wies
An integral part of all debugging activities is the task of diagnosing the cause of an error. Most existing fault diagnosis techniques rely on the availability of high-quality test suites because they work by comparing failing and passing runs to identify the error cause. This limits their applicability. One alternative is techniques that statically analyze an error trace of the program without relying on additional passing runs for comparison. Particularly promising are novel proof-based approaches that leverage the advances in automated theorem proving to obtain an abstraction of the program that aids fault diagnostics. However, existing proof-based approaches still have practical limitations such as reduced scalability and dependence on complex mathematical models of programs. Such models are notoriously difficult to develop for real-world programs. Inspired by concolic testing, we propose a novel algorithm that integrates concrete execution and symbolic reasoning about the error trace to address these challenges. Specifically, we execute the error trace to obtain intermediate program states that allow us to split the trace into smaller fragments, each of which can be analyzed in isolation using an automated theorem prover. Moreover, we show how this approach can avoid complex logical encodings when reasoning about traces in low-level C programs. We have conducted an experiment where we applied our new algorithm to error traces generated from faulty versions of UNIX utils such as gzip and sed. Our experiment indicates that our concolic fault abstraction scales to real-world error traces and generates useful error diagnoses.
{"title":"Concolic Fault Abstraction","authors":"Chanseok Oh, Martin Schäf, Daniel Schwartz-Narbonne, Thomas Wies","doi":"10.1109/SCAM.2014.22","DOIUrl":"https://doi.org/10.1109/SCAM.2014.22","url":null,"abstract":"An integral part of all debugging activities is the task of diagnosing the cause of an error. Most existing fault diagnosis techniques rely on the availability of high quality test suites because they work by comparing failing and passing runs to identify the error cause. This limits their applicability. One alternative are techniques that statically analyze an error trace of the program without relying on additional passing runs to compare against. Particularly promising are novel proof-based approaches that leverage the advances in automated theorem proving to obtain an abstraction of the program that aids fault diagnostics. However, existing proof-based approaches still have practical limitations such as reduced scalability and dependence on complex mathematical models of programs. Such models are notoriously difficult to develop for real-world programs. Inspired by concolic testing, we propose a novel algorithm that integrates concrete execution and symbolic reasoning about the error trace to address these challenges. Specifically, we execute the error trace to obtain intermediate program states that allow us to split the trace into smaller fragments, each of which can be analyzed in isolation using an automated theorem prover. Moreover, we show how this approach can avoid complex logical encodings when reasoning about traces in low-level C programs. We have conducted an experiment where we applied our new algorithm to error traces generated from faulty versions of UNIX utils such as gzip and sed. 
Our experiment indicates that our concolic fault abstraction scales to real-world error traces and generates useful error diagnoses.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124675123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
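The trace-splitting step of the abstract, executing the error trace concretely and cutting it into fragments at recorded intermediate states, can be sketched as follows. This is a minimal illustration under assumed representations (statements as state-to-state functions); the per-fragment theorem-prover call is out of scope, so the sketch only produces the concrete entry and exit states each fragment would be checked against in isolation.

```python
def run_trace(trace, state):
    """Concretely execute a list of statements (each a state -> state
    function), recording the program state after every statement."""
    states = [dict(state)]
    for stmt in trace:
        state = stmt(dict(state))
        states.append(dict(state))
    return states

def split_into_fragments(trace, states, k=2):
    """Split the error trace into fragments of at most k statements,
    pairing each with its concrete entry and exit states. Each triple
    (entry, fragment, exit) could then be handed to an automated
    theorem prover and analyzed in isolation."""
    fragments = []
    for i in range(0, len(trace), k):
        fragments.append((states[i], trace[i:i + k],
                          states[min(i + k, len(trace))]))
    return fragments
```

Because each fragment carries concrete boundary states, a prover only needs to reason about a short statement sequence at a time, which is the source of the scalability the paper reports.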
Are Object Graphs Extracted Using Abstract Interpretation Significantly Different from the Code?
Marwan Abi-Antoun, Sumukhi Chandrashekar, R. Vanciu, Andrew Giang
To evolve object-oriented code, one must understand both the code structure in terms of classes, and the runtime structure in terms of abstractions of the objects being created and the relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively to the point of being similar to a class diagram, where one box for a class represents all the instances of that class. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure, to measure how they differ, and whether the differences are indicative of language or design features such as encapsulation, polymorphism and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. In several cases, the magnitude of this difference is large.
{"title":"Are Object Graphs Extracted Using Abstract Interpretation Significantly Different from the Code?","authors":"Marwan Abi-Antoun, Sumukhi Chandrashekar, R. Vanciu, Andrew Giang","doi":"10.1109/SCAM.2014.42","DOIUrl":"https://doi.org/10.1109/SCAM.2014.42","url":null,"abstract":"To evolve object-oriented code, one must understand both the code structure in terms of classes, and the runtime structure in terms of abstractions of objects that are being created and relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively to the point of being similar to a class diagram, where one box for a class represents all the instances of that class. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure, to measure how they differ, and if the differences are indicative of language or design features such as encapsulation, polymorphism and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. 
In several cases, the magnitude of this difference is large.","PeriodicalId":407060,"journal":{"name":"2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130630412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
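The kind of metric the abstract describes, relating edges in the runtime object graph to elements of the code structure, can be illustrated with a toy sketch. This is not one of the paper's actual metrics: the representation (object graphs as edge lists, a hypothetical `type_of` map from object ids to class names) and the ratio itself are assumptions made for illustration.

```python
def class_edges(object_graph, type_of):
    """Project a runtime object graph (edges between object ids) onto
    the code structure by mapping each object to its class."""
    edges = set()
    for src, dst in object_graph:
        edges.add((type_of[src], type_of[dst]))
    return edges

def edge_ratio(object_graph, type_of):
    """A simple divergence metric: how many distinct runtime object
    edges map to each class-level edge. A ratio of 1.0 means the object
    graph adds nothing beyond the class diagram; larger values indicate
    runtime structure that the code structure alone does not show."""
    projected = class_edges(object_graph, type_of)
    return len(set(object_graph)) / max(len(projected), 1)
```

For example, two `A` instances each pointing at the same `B` instance collapse to a single `A -> B` class edge, so the ratio exceeds 1.0, which is the kind of code-versus-object-graph difference the paper measures.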
Journal
2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation