
Latest Publications — 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)

How Does Code Reviewing Feedback Evolve?: A Longitudinal Study at Dell EMC
R. Wen, Maxime Lamothe, Shane McIntosh
Code review is an integral part of modern software development, where fellow developers critique the content, premise, and structure of code changes. Organizations like Dell EMC have made considerable investments in code review, yet tracking the characteristics of the feedback that reviews provide (a primary product of the code reviewing process) remains difficult. To understand community and personal feedback trends, we perform a longitudinal study of 39,249 reviews that contain 248,695 review comments from a proprietary project developed by Dell EMC. To investigate generalizability, we replicate our study on the OpenStack Nova project. Through an analysis guided by topic models, we observe that more context-specific, technical feedback is introduced as the studied projects and communities age and as the reviewers within those communities accrue experience. This suggests that communities reap a larger return on their investment in code review as they grow accustomed to the practice and as reviewers hone their skills. The code review trends uncovered by our models present opportunities for enterprises to monitor reviewing tendencies and improve knowledge transfer and reviewer skills.
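The topic-model-guided analysis can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration using scikit-learn's LDA on toy review comments, not the authors' pipeline; studying evolution would mean fitting per-period corpora and comparing topic prevalence over time.

```python
# Minimal sketch: topic modeling over code review comments with LDA.
# Toy data and parameters; the study's corpus and model settings differ.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "please add a unit test for the null case",
    "this lock ordering can deadlock under heavy load",
    "typo in the variable name, should be camelCase",
    "consider extracting this block into a helper function",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(comments)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Inspect the top words per topic; feedback trends would come from
# comparing topic weights across time periods or reviewer cohorts.
terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-3:][::-1]]
    print(f"topic {i}: {top}")
```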
{"title":"How Does Code Reviewing Feedback Evolve?: A Longitudinal Study at Dell EMC","authors":"R. Wen, Maxime Lamothe, Shane McIntosh","doi":"10.1145/3510457.3513039","DOIUrl":"https://doi.org/10.1145/3510457.3513039","url":null,"abstract":"Code review is an integral part of modern software development, where fellow developers critique the content, premise, and structure of code changes. Organizations like DellEMC have made considerable investment in code reviews, yet tracking the characteristics of feedback that code reviews provide (a primary product of the code reviewing process) is still a difficult process. To understand community and personal feedback trends, we perform a longitudinal study of 39,249 reviews that contain 248,695 review comments from a proprietary project that is developed by DellEMC. To investigate generalizability, we replicate our study on the OpenStackn Ova project. Through an analysis guided by topic models, we observe that more context-specific, technical feedback is introduced as the studied projects and communities age and as the reviewers within those communities accrue experience. This suggests that communities are reaping a larger return on investment in code review as they grow accustomed to the practice and as reviewers hone their skills. The code review trends uncovered by our models present opportunities for enterprises to monitor reviewing tendencies and improve knowledge transfer and reviewer skills.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123420251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions
Markos Viggiato, Dale Paas, C. Buzon, C. Bezemer
Despite recent advancements in test automation, testing often remains a manual, and costly, activity in many industries. Manual test cases, often described only in natural language, consist of one or more test steps: instructions that must be performed to achieve the testing objective. Having different employees specify test cases can result in redundant, unclear, or incomplete test cases. Manually reviewing and validating newly-specified test cases is time-consuming and becomes impractical with a large test suite. Therefore, in this paper, we propose a framework to automatically analyze test cases that are specified in natural language and provide actionable recommendations on how to improve them. Our framework consists of configurable components and analysis modules, which are capable of (1) improving the terminology of a new test case through language modeling, (2) recommending potentially missing test steps for a new test case through frequent itemset and association rule mining, and (3) recommending similar test cases that already exist in the test suite through text embedding and clustering. We thoroughly evaluated the three modules on data from our industry partner. Our framework can provide actionable recommendations, addressing an important challenge given the widespread occurrence of test cases that are described only in natural language in the software industry (in particular, the game industry).
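As a flavor of module (3), the sketch below pairs TF-IDF text embeddings with k-means clustering to flag similar existing test cases. These are assumed stand-ins; the paper's abstract does not commit to these exact embedding and clustering choices.

```python
# Minimal sketch: group natural-language test cases by textual similarity.
# Test cases sharing a cluster are candidates for a "similar test case
# already exists" recommendation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

test_cases = [
    "open the settings menu and verify the sound option is visible",
    "open settings and check that the sound toggle appears",
    "start a new game and verify the score resets to zero",
]

emb = TfidfVectorizer().fit_transform(test_cases)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)

for label, case in sorted(zip(labels, test_cases)):
    print(label, case)
```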
{"title":"Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions","authors":"Markos Viggiato, Dale Paas, C. Buzon, C. Bezemer","doi":"10.1145/3510457.3513045","DOIUrl":"https://doi.org/10.1145/3510457.3513045","url":null,"abstract":"Despite the recent advancements in test automation, testing often remains a manual, and costly, activity in many industries. Manual test cases, often described only in natural language, consist of one or more test steps, which are instructions that must be performed to achieve the testing objective. Having different employees specifying test cases might result in redundant, unclear, or incomplete test cases. Manually reviewing and validating newly-specified test cases is time-consuming and becomes impractical in a scenario with a large test suite. Therefore, in this paper, we propose an automated framework to automatically analyze test cases that are specified in natural language and provide actionable recommendations on how to improve the test cases. Our framework consists of configurable components and modules for analysis, which are capable of recommending improvements to the following: (1) the terminology of a new test case through language modeling, (2) potentially missing test steps for a new test case through frequent itemset and association rule mining, and (3) recommendation of similar test cases that already exist in the test suite through text embedding and clustering. We thoroughly evaluated the three modules on data from our industry partner. Our framework can provide actionable recommendations, which is an important challenge given the widespread occurrence of test cases that are described only in natural language in the software industry (in particular, the game industry).","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126052884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Strategies for Reuse and Sharing among Data Scientists in Software Teams
Will Epperson, Yi Wang, R. Deline, S. Drucker
Effective sharing and reuse practices have long been hallmarks of proficient software engineering. Yet the exploratory nature of data science presents new challenges and opportunities for supporting the sharing and reuse of analysis code. To better understand current practices, we conducted interviews (N=17) and a survey (N=132) with data scientists at Microsoft, and extracted five commonly used strategies for sharing and reusing past work: personal analysis reuse, personal utility libraries, team-shared analysis code, team-shared template notebooks, and team-shared libraries. We also identify factors that encourage or discourage data scientists from sharing and reusing. Our participants described obstacles to reuse and sharing, including a lack of incentives to create shared code, difficulties in making data science code modular, and a lack of tool interoperability. We discuss how future tools might help meet these needs.
{"title":"Strategies for Reuse and Sharing among Data Scientists in Software Teams","authors":"Will Epperson, Yi Wang, R. Deline, S. Drucker","doi":"10.1145/3510457.3513042","DOIUrl":"https://doi.org/10.1145/3510457.3513042","url":null,"abstract":"Effective sharing and reuse practices have long been hallmarks of proficient software engineering. Yet the exploratory nature of data science presents new challenges and opportunities to support sharing and reuse of analysis code. To better understand current practices, we conducted interviews (N=17) and a survey (N=132) with data scientists at Microsoft, and extract five commonly used strategies for sharing and reuse of past work: personal analysis reuse, personal utility libraries, team shared analysis code, team shared template notebooks, and team shared libraries. We also identify factors that encourage or discourage data scientists from sharing and reusing. Our participants described obstacles to reuse and sharing including a lack of incentives to create shared code, difficulties in making data science code modular, and a lack of tool interoperability. We discuss how future tools might help meet these needs.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129647014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Unreliable Test Infrastructures in Automotive Testing Setups
Claudius V. Jordan, P. Foth, A. Pretschner, Matthias Fruth
During system testing of automotive electronic control units, various causes can lead to invalid test failures, e.g., non-responding components, faulty simulation models, faulty test case implementations, or hardware or software misconfigurations. To determine whether a test failure is invalid and what its underlying cause was, test executions have to be analyzed manually, which is tedious and therefore costly. In this work, we report the magnitude of the problem of invalid test failures in four system testing projects from the automotive domain. We find that up to 91% of failed test executions are considered invalid. An oftentimes overlooked challenge is unreliable test infrastructure, which deteriorates the validity of test runs. In the studied projects, between 27% and 53% of failed test executions are already linked to unreliable test infrastructures.
{"title":"Unreliable Test Infrastructures in Automotive Testing Setups","authors":"Claudius V. Jordan, P. Foth, A. Pretschner, Matthias Fruth","doi":"10.1145/3510457.3513069","DOIUrl":"https://doi.org/10.1145/3510457.3513069","url":null,"abstract":"During system testing of automotive electrical control units various reasons can lead to invalid test failures, e.g., non-responding components, faulty simulation models, faulty test case implementations, or hardware or software misconfigurations. To determine whether a test failure is invalid and what the underlying cause was, the test executions have to be analyzed manually, which is tedious and therefore costly. In this work, we report the magnitude of the problem of invalid test failures with four system testing projects from the automotive domain. We find that up to 91% of failed test executions are considered invalid. An oftentimes overlooked challenge are unreliable test infrastructures which deteriorate the validity of the test runs. In the studied projects already between 27% and 53% of failed test executions are linked to unreliable test infrastructures.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125303326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Testing Machine Learning Systems in Industry: An Empirical Study
Shuyue Li, Jiaqi Guo, Jian-Guang Lou, Ming Fan, Ting Liu, Dongmei Zhang
Machine learning is becoming increasingly prevalent and integrated into a wide range of software systems. These systems, named ML systems, must be adequately tested to gain confidence that they behave correctly. Although many research efforts have been devoted to testing technologies for ML systems, industrial teams face new challenges when testing ML systems in real-world settings. To draw insights from industry on the problems in ML testing, we conducted an empirical study comprising a survey with 87 responses and interviews with 7 senior ML practitioners from well-known IT companies. Our study uncovers significant industrial concerns regarding the major testing activities, i.e., test data collection, test execution, and test result analysis, as well as good practices and open challenges from the industry's perspective. (1) Test data collection is conducted in different ways for the ML model, data, and code, each facing different challenges. (2) Test execution in ML systems suffers from two major problems: entanglement among components and regressions in model performance. (3) Test result analysis centers on quantitative methods, e.g., metric-based evaluation, combined with qualitative methods based on practitioners' experience. Based on our findings, we highlight research opportunities and provide implications for practitioners.
{"title":"Testing Machine Learning Systems in Industry: An Empirical Study","authors":"Shuyue Li†, Jiaqi Guo, Jian-Guang Lou, Ming Fan, Ting Liu, Dongmei Zhang","doi":"10.1145/3510457.3513036","DOIUrl":"https://doi.org/10.1145/3510457.3513036","url":null,"abstract":"Machine learning becomes increasingly prevalent and integrated into a wide range of software systems. These systems, named ML systems, must be adequately tested to gain confidence that they behave correctly. Although many research efforts have been devoted to testing technologies for ML systems, the industrial teams are faced with new challenges on testing the ML systems in real-world settings. To absorb inspirations from the industry on the problems in ML testing, we conducted an empirical study including a survey with 87 responses and interviews with 7 senior ML practitioners from well-known IT companies. Our study uncovers significant industrial concerns on major testing activities, i.e., test data collection, test execution, and test result analysis, and also the good practices and open challenges from the perspective of the industry. (1) Test data collection is conducted in different ways on ML model, data, and code and faced with different challenges. (2) Test execution in ML systems suffers from two major problems: entanglement among the components and the regression on model performance. (3) Test result analysis centers on quantitative methods, e.g., metric-based evaluation, and is combined with some qualitative methods based on practitioners’ experience. Based on our findings, we highlight the research opportunities and also provide some implications for practitioners.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"658 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132351481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Issues in the Adoption of the Scaled Agile Framework
P. Ciancarini, A. Kruglov, W. Pedrycz, Dilshat Salikhov, G. Succi
Agile methods were originally introduced for small, colocated teams. Their success immediately raised the issue of adapting the methods for large and distributed organizations engaged in projects to build major, complex products. Currently, the most popular multi-team agile method is the Scaled Agile Framework (SAFe), which, however, is subject to criticism: it appears to be quite demanding and expensive in terms of human resources and project management practices. Moreover, SAFe allegedly goes against some of the principles of agility. This research attempts to gain a deeper understanding of the matter, first reviewing and analysing the studies published on this topic via a multivocal literature review, and then conducting an extended empirical investigation of the most controversial issues via direct analysis of the work of 25 respondents from 17 different companies located in eight countries. The originality of this research thus lies in its systemic assessment of the “level of flexibility” of SAFe, highlighting the challenges of adopting this framework as they relate to decision making, structure, and the technical and managerial competencies of the company. The results show that SAFe can be an effective and adequate approach if the company is ready to invest significant effort and resources, both by providing time for SAFe to be properly absorbed and by offering specific training for individuals.
{"title":"Issues in the Adoption of the Scaled Agile Framework","authors":"P. Ciancarini, A. Kruglov, W. Pedrycz, Dilshat Salikhov, G. Succi","doi":"10.1145/3510457.3513028","DOIUrl":"https://doi.org/10.1145/3510457.3513028","url":null,"abstract":"Agile methods were originally introduced for small sized, colocated teams. Their successful products immediately brought up the issue of adapting the methods also for large and distributed organizations engaged in projects to build major, complex products. Currently the most popular multi-teams agile method is the Scaled Agile Framework (SAFe) which, however, is subject to criticism: it appears to be quite demanding and expensive in terms of human resource and project management practices. Moreover, SAFe allegedly goes against some of the principles of agility. This research attempts to gather a deeper understanding of the matter first reviewing and analysing the studies published on this topic via a multivocal literature review and then with an extended empirical investigation on the matters that appear most controversial via the direct analysis of the work of 25 respondents from 17 different companies located in eight countries. Thus, the originality of this research is in the systemic assessment of the “level of flexibility” of SAFe, highlighting the challenges of adopting this framework as it relates to decision making, structure, and the technical and managerial competencies of the company. The results show that SAFe can be an effective and adequate approach if the company is ready to invest a significant effort and resources into it both in the form of providing time for SAFe to be properly absorbed and specific training for individuals.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132810598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Verifying Dynamic Trait Objects in Rust
Alexa VanHattum, Daniel Schwartz-Narbonne, Nathan Chong, Adrian Sampson
Rust has risen in prominence as a systems programming language in large part due to its focus on reliability. The language's advanced type system and borrow checker eliminate certain classes of memory safety violations. But for critical pieces of code, teams need assurance beyond what the type checker alone can provide. Verification tools for Rust can check other properties, from memory faults in unsafe Rust code to user-defined correctness assertions. This paper particularly focuses on the challenges in reasoning about Rust's dynamic trait objects, a feature that provides dynamic dispatch for function abstractions. While the explicit dyn keyword that denotes dynamic dispatch is used in 37% of the 500 most-downloaded Rust libraries (crates), dynamic dispatch is implicitly linked into 70%. To our knowledge, our open-source Kani Rust Verifier is the first symbolic model checking tool for Rust that can verify correctness while supporting the breadth of dynamic trait objects, including dynamically dispatched closures. We show how our system uses semantic trait information from Rust's Mid-level Intermediate Representation (an advantage over targeting a language-agnostic level such as LLVM) to improve verification performance by 5%–15× for examples from open-source virtualization software. Finally, we share an open-source suite of verification test cases for dynamic trait objects.
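For readers unfamiliar with the feature under verification, here is a minimal Rust sketch of trait objects and dynamic dispatch. The types are illustrative and unrelated to the paper or the Kani tool itself.

```rust
// `dyn Trait` erases the concrete type; method calls resolve through a
// vtable at runtime, which is what makes such code hard to analyze statically.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }
struct Square { s: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}

impl Shape for Square {
    fn area(&self) -> f64 { self.s * self.s }
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Circle { r: 1.0 }), Box::new(Square { s: 2.0 })];
    for s in &shapes {
        println!("area = {}", s.area());
    }

    // Closures behind `dyn Fn` are also dynamically dispatched trait objects.
    let double: Box<dyn Fn(f64) -> f64> = Box::new(|x| x * 2.0);
    println!("{}", double(21.0));
}
```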
{"title":"Verifying Dynamic Trait Objects in Rust","authors":"Alexa VanHattum, Daniel Schwartz-Narbonne, Nathan Chong, Adrian Sampson","doi":"10.1145/3510457.3513031","DOIUrl":"https://doi.org/10.1145/3510457.3513031","url":null,"abstract":"Rust has risen in prominence as a systems programming language in large part due to its focus on reliability. The language's advanced type system and borrow checker eliminate certain classes of memory safety violations. But for critical pieces of code, teams need assurance beyond what the type checker alone can provide. Verification tools for Rust can check other properties, from memory faults in unsafe Rust code to user-defined correctness assertions. This paper particularly focuses on the challenges in reasoning about Rust's dynamic trait objects, a feature that provides dynamic dispatch for function abstractions. While the explicit dyn keyword that denotes dynamic dispatch is used in 37% of the 500 most-downloaded Rust libraries (crates), dynamic dispatch is implicitly linked into 70%. To our knowledge, our open-source Kani Rust Verifier is the first symbolic modeling checking tool for Rust that can verify correctness while supporting the breadth of dynamic trait objects, including dynamically dispatched closures. We show how our system uses semantic trait information from Rust's Mid-level Intermediate Representation (an advantage over targeting a language-agnostic level such as LLVM) to improve verification performance by 5%–15× for examples from open-source virtualization software. Finally, we share an open-source suite of verification test cases for dynamic trait objects.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133331560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Industry's Cry for Tools that Support Large-Scale Refactoring
James Ivers, R. Nord, I. Ozkaya, Chris Seifried, C. Timperley, M. Kessentini
Software refactoring plays an important role in software engineering. Developers often turn to refactoring when they want to restructure software to improve its quality without changing its external behavior. Compared to small-scale (floss) refactoring, many refactoring efforts are much larger, requiring entire teams and months of effort, and the role of tools in these efforts is not as well studied. This short paper introduces an industry survey that we conducted. Results from 107 developers demonstrate that projects commonly go through multiple large-scale refactorings, each of which requires considerable effort. While there is often a desire to refactor, other business concerns such as developing new features often take higher priority. Our study finds that developers use several categories of tools to support large-scale refactoring and rely more heavily on general-purpose tools like IDEs than on tools designed specifically to support refactoring. Tool support varies across the different activities (spanning communication, reasoning, and technical activities), with some particularly challenging activities seeing little use of tools in practice. Our study demonstrates a clear need for better large-scale refactoring tools.
{"title":"Industry's Cry for Tools that Support Large-Scale Refactoring","authors":"James Ivers, R. Nord, I. Ozkaya, Chris Seifried, C. Timperley, M. Kessentini","doi":"10.1145/3510457.3513074","DOIUrl":"https://doi.org/10.1145/3510457.3513074","url":null,"abstract":"Software refactoring plays an important role in software engineering. Developers often turn to refactoring when they want to restructure software to improve its quality without changing its external behavior. Compared to small-scale (floss) refactoring, many refactoring efforts are much larger, requiring entire teams and months of effort, and the role of tools in these efforts is not as well studied. This short paper introduces an industry survey that we conducted. Results from 107 developers demonstrate that projects commonly go through multiple large-scale refactorings, each of which requires considerable effort. While there is often a desire to refactor, other business concerns such as developing new features often take higher priority. Our study finds that developers use several categories of tools to support large-scale refactoring and rely more heavily on general-purpose tools like IDEs than on tools designed specifically to support refactoring. Tool support varies across the different activities (spanning communication, reasoning, and technical activities), with some particularly challenging activities seeing little use of tools in practice. Our study demonstrates a clear need for better large-scale refactoring tools.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"4 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114046188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Impact of Flaky Tests on Historical Test Prioritization on Chrome
Emad Fallahzadeh, Peter C. Rigby
Test prioritization algorithms prioritize likely-failing tests to give faster feedback to developers when a failure occurs. Test prioritization approaches that use historical failures to run tests that have failed in the past may be susceptible to flaky tests, as these tests often fail and then pass without identifying a fault. Traditionally, flaky failures, like other types of failures, are considered blocking, i.e., the test needs to be investigated before the code can move to the next stage. However, on Google Chrome, flaky failures are non-blocking and the code still moves to the next stage in the CI pipeline. In this work, we explain the Chrome testing pipeline and classification. Then, we re-implement two important history-based test prioritization algorithms and evaluate them on over 276 million test runs from the Chrome project. We apply these algorithms in two scenarios: first, we consider flaky failures as blocking; then, we use Chrome's approach and consider flaky failures as non-blocking. Our investigation reveals that 99.58% of all failures are flaky. These types of failures are much more repetitive than non-flaky failures, and they are also well distributed over time. We conclude that the prior performance of the prioritization algorithms has been inflated by flaky failures. We release our data and scripts in our replication package [8].
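The core idea behind history-based prioritization can be sketched as follows. This is a deliberately simple failure-count ranking on toy data; the two algorithms re-implemented in the paper are more involved, and the study's point is that flaky failures can dominate such histories.

```python
# Minimal sketch: rank tests by how often they failed in past runs.
from collections import Counter

# (test name, outcome) pairs, oldest to newest; toy data, not Chrome's.
history = [
    ("test_login", "FAIL"), ("test_login", "PASS"),
    ("test_render", "FAIL"), ("test_render", "FAIL"),
    ("test_cache", "PASS"),
]

fail_counts = Counter(name for name, outcome in history if outcome == "FAIL")

# Most historical failures first; `sorted` is stable, so ties keep
# alphabetical order. If most failures are flaky, this ranking rewards
# tests that never exposed a real fault.
all_tests = sorted({name for name, _ in history})
prioritized = sorted(all_tests, key=lambda t: fail_counts[t], reverse=True)
print(prioritized)  # ['test_render', 'test_login', 'test_cache']
```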
{"title":"The Impact of Flaky Tests on Historical Test Prioritization on Chrome","authors":"Emad Fallahzadeh, Peter C. Rigby","doi":"10.1145/3510457.3513038","DOIUrl":"https://doi.org/10.1145/3510457.3513038","url":null,"abstract":"Test prioritization algorithms prioritize probable failing tests to give faster feedback to developers in case a failure occurs. Test prioritization approaches that use historical failures to run tests that have failed in the past may be susceptible to flaky tests as these tests often fail and then pass without identifying a fault. Traditionally, flaky failures like other types of failures are considered blocking, i. e. a test that needs to be investigated before the code can move to the next stage. However, on Google Chrome, flaky failures are non-blocking and the code still moves to the next stage in the CI pipeline. In this work, we explain the Chrome testing pipeline and classification. Then, we re-implement two important history based test prioritization algorithms and evaluate them on over 276 million test runs from the Chrome project. We apply these algorithms in two scenarios. First, we consider flaky failures as blocking and then, we use Chrome's approach and consider flaky failures as non-blocking. Our investigation reveals that 99.58% of all failures are flaky. These types of failures are much more repetitive than non-flaky failures, and they are also well distributed over time. We conclude that the prior performance of the prioritization algorithms have been inflated by flaky failures. We release our data and scripts in our replication package [8].","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116035539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
An Asynchronous Call Graph for JavaScript
Dominik Seifert, Michael Wan, Jane Yung-jen Hsu, Benson Yeh
Asynchronous JavaScript has become omnipresent, yet is inherently difficult to reason about. While many recent debugging tools try to address this issue with (semi-)automatic methods, interactive analysis tools are few and far between. To date, developers are required to build mental models of complex concurrent control flows with little to no tool support. Thus, asynchrony makes life hard for novices and catches even seasoned developers off guard, especially when dealing with unfamiliar code. That is why we propose the Asynchronous Call Graph (ACG). It is the first approach to capture and visualize concurrent control flow between call graph roots. It is also the first concurrency analysis tool for JavaScript that is fully interactive and integrated with an omniscient debugger in a popular IDE. Initial tests show that the ACG works successfully on real-world codebases. This approach has the potential to set a new standard for how developers analyze asynchrony.
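To convey what such a graph records, here is a minimal analogy in Python's asyncio (a stand-in for JavaScript, and not the paper's implementation): caller-to-callee edges are captured as control flow crosses await and task boundaries.

```python
# Minimal sketch: record edges of an "asynchronous call graph" by carrying
# the current caller through the task context. Names are illustrative.
import asyncio
import contextvars

caller = contextvars.ContextVar("caller", default="<root>")
edges = []

def traced(fn):
    async def wrapper(*args, **kwargs):
        edges.append((caller.get(), fn.__name__))
        token = caller.set(fn.__name__)
        try:
            return await fn(*args, **kwargs)
        finally:
            caller.reset(token)
    return wrapper

@traced
async def fetch_data():
    await asyncio.sleep(0)

@traced
async def main():
    await fetch_data()
    # Context is copied into new tasks, so the edge survives the task boundary.
    await asyncio.create_task(fetch_data())

asyncio.run(main())
print(edges)  # [('<root>', 'main'), ('main', 'fetch_data'), ('main', 'fetch_data')]
```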
{"title":"An Asynchronous Call Graph for JavaScript","authors":"Dominik Seifert, Michael Wan, Jane Yung-jen Hsu, Benson Yeh","doi":"10.1145/3510457.3513059","DOIUrl":"https://doi.org/10.1145/3510457.3513059","url":null,"abstract":"Asynchronous JavaScript has become omnipresent, yet is inherently difficult to reason about. While many recent debugging tools are trying to address this issue with (semi-)automatic methods, interactive analysis tools are few and far between. To this date, developers are required to build mental models of complex concurrent control flows with little to no tool support. Thus, asynchrony is making life hard for novices and catches even seasoned developers off-guard, especially when dealing with unfamiliar code. That is why we propose the Asynchronous Call Graph. It is the first approach to capture and visualize concurrent control flow between call graph roots. It is also the first concurrency analysis tool for JavaScript that is fully interactive and integrated with an omniscient debugger in a popular IDE. First tests show that the ACG works successfully on real-world codebases. This approach has the potential to set a new standard for how developers can analyze asynchrony.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129213269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3