Towards Better Understanding Developer Perception of Refactoring
E. Alomar
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00100
Refactoring is a critical task in software maintenance. It is typically performed to enforce best design practices or to cope with design defects. Research in refactoring has been driven by the need to improve system structures. However, recent studies have shown that developers may incorporate refactoring strategies into other development-related activities that go beyond improving the design. Unfortunately, these studies are limited to developer interviews and a small set of projects. In this context, we aim to explore how developers document their refactoring activities during the software life cycle, an activity we call Self-Affirmed Refactoring (SAR), by understanding developers' perception of refactoring, so that we can bridge the gap between refactoring and automation in general. We aim to more accurately mimic human decision-making when recommending better software refactoring and remodularization.
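One way to study documented refactoring activity at scale is to scan commit messages for refactoring-related phrasing. The sketch below is a minimal, hypothetical illustration of that idea; the keyword list and function name are assumptions for illustration, not the method proposed in the paper.

```python
# Hypothetical SAR detector: flag commits whose messages explicitly
# document refactoring. The keyword list is an illustrative assumption.
SAR_KEYWORDS = ("refactor", "restructure", "rename", "extract method",
                "move class", "clean up", "simplify")

def is_self_affirmed_refactoring(commit_message: str) -> bool:
    """Return True if the commit message explicitly documents refactoring."""
    text = commit_message.lower()
    return any(keyword in text for keyword in SAR_KEYWORDS)

messages = [
    "Refactor session handling into its own module",
    "Fix off-by-one error in pagination",
]
flags = [is_self_affirmed_refactoring(m) for m in messages]
```

A real study would need a far richer vocabulary and manual validation, since words like "simplify" also appear in non-refactoring commits.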
DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks
Houssem Ben Braiek, Foutse Khomh
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00078
The increasing inclusion of Deep Learning (DL) models in safety-critical systems such as autonomous vehicles has led to the development of multiple model-based DL testing techniques. One common denominator of these techniques is the automated generation of test cases, e.g., new inputs transformed from the original training data with the aim of optimizing some test adequacy criterion. So far, the effectiveness of these approaches has been hindered by their reliance on random fuzzing or on transformations that do not always produce test cases with good diversity. To overcome these limitations, we propose DeepEvolution, a novel search-based approach for testing DL models that relies on metaheuristics to maximize the diversity of generated test cases. We assessed the effectiveness of DeepEvolution in testing computer-vision DL models and found that it significantly increases the neuronal coverage of generated test cases. Moreover, using DeepEvolution, we successfully found several corner-case behaviors. Finally, DeepEvolution outperformed TensorFuzz (a coverage-guided fuzzing tool developed at Google Brain) in detecting latent defects introduced during the quantization of models. These results suggest that search-based approaches can help build effective testing tools for DL systems.
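The search-based core of such approaches can be pictured as a loop that mutates input-transformation parameters and keeps mutations that improve a test-adequacy fitness. The following is a self-contained sketch under stated assumptions: the fitness function is a stand-in, whereas DeepEvolution's actual fitness involves the neuronal coverage of the model under test, and its metaheuristics are more sophisticated than this simple hill climber.

```python
import random

def fitness(params):
    # Stand-in for a test-adequacy score (e.g., neuronal coverage) of the
    # input produced by these transformation parameters; peaks at (15, 1.5).
    rotation, brightness = params
    return -abs(rotation - 15.0) - abs(brightness - 1.5)

def hill_climb(seed_params, iterations=200, seed=0):
    """Keep randomly mutated parameters whenever they improve the fitness."""
    rng = random.Random(seed)
    best, best_fit = seed_params, fitness(seed_params)
    for _ in range(iterations):
        candidate = (best[0] + rng.uniform(-2.0, 2.0),
                     best[1] + rng.uniform(-0.2, 0.2))
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:          # accept only improving mutations
            best, best_fit = candidate, cand_fit
    return best

best = hill_climb((0.0, 1.0))
```

In a real setting the candidate parameters would drive image transformations (rotation, brightness, shearing, and so on) applied to seed inputs from the training data.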
Towards Generating Transformation Rules without Examples for Android API Replacement
Ferdian Thung, Hong Jin Kang, Lingxiao Jiang, D. Lo
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00032
Deprecation of APIs in software libraries is common: library maintainers make changes to a library and will no longer support certain APIs in the future. When deprecation occurs, developers whose programs depend on those APIs need to replace the usages of the deprecated APIs sooner or later. Often, software documentation specifies which new APIs developers should use in place of a deprecated API. However, replacing the usages of a deprecated API remains a challenge, since developers may not know exactly how to use the new APIs: they first need to understand the API changes before they can replace the deprecated API correctly. Previous work has proposed an approach that assists developers in deprecated Android API replacement by learning from examples. In this work, we also focus on Android APIs and propose an approach named No Example API Transformation (NEAT) to generate transformation rules that can assist developers in deprecated API replacement even when code examples are not available (e.g., when the deprecation has happened recently). We have validated the effectiveness of NEAT in generating transformation rules for deprecated Android APIs. Using NEAT, we can generate transformation rules for 37 out of a selection of 100 deprecated APIs, and we have validated these rules to be correct.
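A transformation rule of the kind described above can be pictured as a pattern over the deprecated call plus a rewrite template. The sketch below uses Android's `Resources.getDrawable(int)`, which documentation points to `getDrawable(int, Theme)` as replacing; the textual rule format is an assumption for illustration, not NEAT's actual rule representation.

```python
import re

# One hypothetical rule: (pattern matching the deprecated call,
# rewrite template inserting the extra Theme argument).
RULES = [
    # Resources.getDrawable(int) -> Resources.getDrawable(int, Theme)
    (re.compile(r"\.getDrawable\((?P<id>[\w.]+)\)"),
     r".getDrawable(\g<id>, getTheme())"),
]

def apply_rules(source: str) -> str:
    """Rewrite every deprecated call matched by a rule."""
    for pattern, template in RULES:
        source = pattern.sub(template, source)
    return source

migrated = apply_rules("Drawable d = res.getDrawable(R.drawable.icon);")
```

A purely textual rewrite like this is only a sketch; a tool operating on real code would match on the resolved API signature rather than on surface syntax.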
Impact of Switching Bug Trackers: A Case Study on a Medium-Sized Open Source Project
Théo Zimmermann, Annalí Casanueva Artís
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00011
For most software projects, the bug tracker is an essential tool. In open source development, this tool plays an even more central role, as it is generally open to all users, who are encouraged to test the software and report bugs. Previous studies have highlighted the act of reporting a bug as a first step leading a user to become an active contributor. The impact of the bug reporting environment on bug tracking activity is difficult to assess because of the lack of comparison points. In this paper, we take advantage of the switch of the bug tracker of Coq, a medium-sized open source project, from Bugzilla to GitHub, to evaluate and interpret the impact that such a change can have. We first report on the switch itself, including the migration of preexisting issues. Then we analyze data from before and after the switch using a regression discontinuity design, an econometric methodology imported from quantitative policy analysis. We complement this quantitative analysis with qualitative data from interviews with developers. We show that the switch induces an increase in bug reporting, particularly from the principal developers themselves, and more generally an increased engagement with the bug tracking platform, with more comments by developers and also more external commenters.
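The regression discontinuity idea can be illustrated on synthetic data: fit the trend separately on each side of the switch date, then read off the jump at the cutoff as the estimated effect. The weekly counts below are made up for illustration; they are not the paper's data.

```python
def ols(xs, ys):
    """Ordinary least squares fit for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

cutoff = 0                       # week 0 = the bug-tracker switch
weeks = list(range(-8, 8))
# synthetic weekly bug-report counts with a jump of 5 at the switch
reports = [10 + 0.5 * w + (5 if w >= cutoff else 0) for w in weeks]

left = [(w, r) for w, r in zip(weeks, reports) if w < cutoff]
right = [(w, r) for w, r in zip(weeks, reports) if w >= cutoff]
a_l, b_l = ols([w for w, _ in left], [r for _, r in left])
a_r, b_r = ols([w for w, _ in right], [r for _, r in right])

# discontinuity at the cutoff = estimated effect of the switch
effect = (a_r + b_r * cutoff) - (a_l + b_l * cutoff)
```

Fitting both sides separately is what distinguishes this design from a naive before/after comparison: a smooth pre-existing trend does not get mistaken for an effect.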
What Factors Make SQL Test Cases Understandable for Testers? A Human Study of Automated Test Data Generation Techniques
A. Alsharif, G. M. Kapfhammer, Phil McMinn
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00076
Since relational databases are a key component of software systems ranging from small mobile to large enterprise applications, there are well-studied methods that automatically generate test cases for database-related functionality. Yet, there has been no research analyzing how well testers, who must often serve as an "oracle", both understand tests involving SQL and decide whether they reveal flaws. This paper reports on a human study of test comprehension in the context of automatically generated tests that assess the correct specification of the integrity constraints in a relational database schema. In this domain, a tool generates INSERT statements with data values designed either to satisfy the schema (i.e., be accepted into the database) or to violate it (i.e., be rejected from the database). The study reveals two key findings. First, the choice of data values in INSERTs influences human understandability: the use of default values for elements not involved in the test (but necessary for adhering to SQL's syntax rules) aided participants, allowing them to easily identify and understand the important test values. Yet, negative numbers and "garbage" strings hindered this process. The second finding is more far-reaching: humans found the outcome of test cases very difficult to predict when NULL was used in conjunction with foreign keys and CHECK constraints. This suggests that, while including NULLs can surface the confusing semantics of database schemas, their use makes tests less understandable for humans.
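The satisfy/violate pairing can be reproduced on a toy schema: one INSERT crafted to be accepted and one crafted to trip a CHECK constraint. The schema and values below are illustrative assumptions, not the study's generated data; the 'default' filler value for the column not under test echoes the finding that such defaults help readers spot the important value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        level INTEGER CHECK (level BETWEEN 1 AND 5)
    )
""")

# satisfying test case: expected to be accepted by the database
conn.execute("INSERT INTO account VALUES (1, 'default', 3)")

# violating test case: the CHECK constraint should reject level = 99
try:
    conn.execute("INSERT INTO account VALUES (2, 'default', 99)")
    violated = False
except sqlite3.IntegrityError:
    violated = True
```

A tester acting as the oracle must predict, from the INSERT alone, which of these two outcomes the database will produce; that prediction is exactly what the study measured.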
Challenges in re-Platforming Mixed Language PL/I and COBOL IS to an Open Systems Platform
Thomas Wagner, Christian Brem, S. Strobl, T. Grechenig
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00056
Re-platforming, the transfer of a legacy system to a new target platform, is often seen as a first cost-reduction step that enables a long-term reengineering effort for a legacy system. However, recreating a functionally equivalent ecosystem on a different platform entails significant technical and organisational challenges. This extended abstract describes challenges, their solutions, and an important lesson learned during a re-platforming project of 40-year-old PL/I and COBOL applications.
Aiding Code Change Understanding with Semantic Change Impact Analysis
Quinn Hanam, A. Mesbah, Reid Holmes
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00031
Code reviews are often used as a means for developers to manually examine source code changes to ensure that the behavioural effects of a change are well understood. Unfortunately, the behavioural impact of a change can include parts of the system outside the area syntactically affected by the change. In the context of code reviews this can be problematic, as the impact of a change can extend beyond the diff that is presented to the reviewer. Change impact analysis is a promising technique that could assist developers by helping surface parts of the code not present in the diff but potentially affected by the change. In this work we investigate the utility of change impact analysis as a tool for helping developers understand the effects of code changes. While we find that traditional techniques may not benefit developers, more precise techniques may reduce time and increase accuracy. Specifically, we propose and study a novel technique that extracts semantic, rather than syntactic, change impact relations from JavaScript commits. We (1) define four novel semantic change impact relations and (2) implement an analysis tool that interprets structural changes over partial JavaScript programs to extract these relations. In a study of 2,000 commits from the version history of three popular NodeJS applications, our tool reduced false positives by 9–37% and further reduced the size of change impact sets by 19–91% by splitting up unrelated semantic relations, compared to change impact sets computed with Unix diff and control and data dependencies. Additionally, through a user study in which developers performed code review tasks with our tool, we found that reducing false positives and providing stronger semantics had a meaningful impact on their ability to find defects within code change diffs.
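At the heart of any change impact analysis, semantic or syntactic, is a transitive closure over dependence edges. The toy sketch below walks a hand-written dependence graph to compute the impact set of a changed function; real tools like the one in the paper derive these edges from control and data dependencies in the code, and this graph is a hypothetical example.

```python
from collections import deque

# edges: function -> functions that directly depend on it
# (a hypothetical dependence graph, hand-written for illustration)
dependents = {
    "parseConfig": ["loadApp"],
    "loadApp": ["main", "reload"],
    "render": ["main"],
}

def impact_set(changed):
    """Everything transitively affected by a change to `changed` (BFS)."""
    seen, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for dep in dependents.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

affected = impact_set("parseConfig")
```

The precision work in the paper amounts to pruning and labeling these edges: the fewer spurious edges in the graph, the smaller and more trustworthy the impact set shown to a reviewer.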
Characterizing Performance Regression Introducing Code Changes
Deema Alshoaibi
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00102
Performance regression testing is highly expensive, as it delays system development when it is, optimally, conducted after each code change. As a result, performance regression testing should be devoted to the code changes that are most likely to introduce a regression. In this context, recent studies have focused on the early identification of potentially problematic code changes by characterizing them with static and dynamic metrics. The aim of my research thesis is to support performance regression testing by better identifying and characterizing regression-introducing code changes. Our first contribution tackled the detection of these changes as an optimization problem: using a combination of static and dynamic metrics, we built, through evolutionary computation, a detection rule that was shown to outperform recent state-of-the-art approaches. To extend this research, we plan to enlarge the set of metrics used in order to better profile problematic code changes. We also plan to reduce the identification cost by searching for a trade-off that limits the use of dynamic metrics while maintaining detection performance. In addition, we would like to prioritize, based on code change characteristics, the test cases to be executed when a regression is predicted.
A Qualitative Study on Framework Debugging
Zack Coker, D. Widder, Claire Le Goues, C. Bogart, Joshua Sunshine
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00091
Features of frameworks, such as inversion of control and the structure of framework applications, require developers to adjust their programming and debugging strategies compared to sequential programs. However, the benefits and challenges of framework debugging are not fully understood, and gaining this knowledge could provide guidance for debugging strategies and framework tool design. To gain insight into the framework application debugging process, we performed two human studies investigating how developers fix applications that use a framework API incorrectly. These studies focused on the Android Fragment class and the ROS framework. We analyzed the results using a mixed-methods approach that draws on qualitative techniques. Our analysis found that participants benefited from the structure of frameworks and from the pre-made solutions to common problems in the domain. Participants encountered challenges in understanding framework abstractions, and had particular difficulty with inversion of control and object protocol issues. When compared to prior work on debugging, these results show that framework applications pose unique debugging challenges.
The Impact of Rare Failures on Statistical Fault Localization: The Case of the Defects4J Suite
Yigit Küçük, Tim A. D. Henderson, Andy Podgurski
Pub Date: 2019-09-01 | DOI: 10.1109/ICSME.2019.00012
Statistical Fault Localization (SFL) uses coverage profiles (or "spectra") collected from passing and failing tests, together with statistical metrics, which are typically composed of simple estimators, to identify which elements of a program are most likely to have caused observed failures. Previous SFL research has not thoroughly examined how the effectiveness of SFL metrics is related to the proportion of failures in test suites and related quantities. To address this issue, we studied the Defects4J benchmark suite of programs and test suites and found that if a test suite has very few failures, SFL performs poorly. To better understand this phenomenon, we investigated the precision of some statistical estimators of which SFL metrics are composed, as measured by their coefficients of variation. The precision of an embedded estimator, which depends on the dataset, was found to correlate with the effectiveness of a metric containing it: low precision is associated with poor effectiveness. Boosting precision by adding test cases was found to improve overall SFL effectiveness. We present our findings and discuss their implications for the evaluation and use of SFL metrics.
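To make the role of the embedded estimators concrete, here is the widely used Ochiai metric, one example of an SFL metric built from the per-element coverage counts; the paper studies several such metrics, and the counts below are hypothetical, chosen to show how a faulty element can still be ranked above a benign one.

```python
import math

def ochiai(ef, nf, ep, np_):
    """Ochiai suspiciousness from a coverage spectrum.
    ef/nf: failing tests that do / do not cover the element;
    ep/np_: passing tests that do / do not cover the element."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# hypothetical suite with a single failing test and 99 passing tests:
# the faulty element is covered by the failure and 4 passing tests,
# a benign element by the failure and 90 passing tests
score_faulty = ochiai(ef=1, nf=0, ep=4, np_=95)
score_benign = ochiai(ef=1, nf=0, ep=90, np_=9)
```

With so few failures, `ef` and `nf` carry almost no information, so the ranking rests mostly on the passing-test counts; adding failing tests sharpens the estimators inside the metric, which is the precision effect the paper measures via coefficients of variation.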