利用树核的源代码相似性确定回归测试优先级

IF 1.7 4区 计算机科学 Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Journal of Software-Evolution and Process Pub Date : 2024-02-15 DOI:10.1002/smr.2653
Francesco Altiero, Anna Corazza, Sergio Di Martino, Adriano Peron, Luigi Libero Lucio Starace
{"title":"利用树核的源代码相似性确定回归测试优先级","authors":"Francesco Altiero,&nbsp;Anna Corazza,&nbsp;Sergio Di Martino,&nbsp;Adriano Peron,&nbsp;Luigi Libero Lucio Starace","doi":"10.1002/smr.2653","DOIUrl":null,"url":null,"abstract":"<p>Regression test prioritization (RTP) is an active research field, aiming at re-ordering the tests in a test suite to maximize the rate at which faults are detected. A number of RTP strategies have been proposed, leveraging different factors to reorder tests. Some techniques include an analysis of changed source code, to assign higher priority to tests stressing modified parts of the codebase. Still, most of these change-based solutions focus on simple text-level comparisons among versions. We believe that measuring source code changes in a more refined way, capable of discriminating between mere textual changes (e.g., renaming of a local variable) and more structural changes (e.g., changes in the control flow), could lead to significant benefits in RTP, under the assumption that major structural changes are also more likely to introduce faults. To this end, we propose two novel RTP techniques that leverage <i>tree kernels</i> (TK), a class of similarity functions largely used in Natural Language Processing on tree-structured data. In particular, we apply TKs to abstract syntax trees of source code, to more precisely quantify the extent of structural changes in the source code, and prioritize tests accordingly. We assessed the effectiveness of the proposals by conducting an empirical study on five real-world Java projects, also used in a number of RTP-related papers. We automatically generated, for each considered pair of software versions (i.e., old version, new version) in the evolution of the involved projects, 100 variations with artificially injected faults, leading to over 5k different software evolution scenarios overall. We compared the proposed prioritization approaches against well-known prioritization techniques, evaluating both their effectiveness and their execution times. Our findings show that leveraging more refined code change analysis techniques to quantify the extent of changes in source code can lead to relevant improvements in prioritization effectiveness, while typically introducing negligible overheads due to their execution.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 8","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Regression test prioritization leveraging source code similarity with tree kernels\",\"authors\":\"Francesco Altiero,&nbsp;Anna Corazza,&nbsp;Sergio Di Martino,&nbsp;Adriano Peron,&nbsp;Luigi Libero Lucio Starace\",\"doi\":\"10.1002/smr.2653\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Regression test prioritization (RTP) is an active research field, aiming at re-ordering the tests in a test suite to maximize the rate at which faults are detected. A number of RTP strategies have been proposed, leveraging different factors to reorder tests. Some techniques include an analysis of changed source code, to assign higher priority to tests stressing modified parts of the codebase. Still, most of these change-based solutions focus on simple text-level comparisons among versions. We believe that measuring source code changes in a more refined way, capable of discriminating between mere textual changes (e.g., renaming of a local variable) and more structural changes (e.g., changes in the control flow), could lead to significant benefits in RTP, under the assumption that major structural changes are also more likely to introduce faults. To this end, we propose two novel RTP techniques that leverage <i>tree kernels</i> (TK), a class of similarity functions largely used in Natural Language Processing on tree-structured data. In particular, we apply TKs to abstract syntax trees of source code, to more precisely quantify the extent of structural changes in the source code, and prioritize tests accordingly. We assessed the effectiveness of the proposals by conducting an empirical study on five real-world Java projects, also used in a number of RTP-related papers. We automatically generated, for each considered pair of software versions (i.e., old version, new version) in the evolution of the involved projects, 100 variations with artificially injected faults, leading to over 5k different software evolution scenarios overall. We compared the proposed prioritization approaches against well-known prioritization techniques, evaluating both their effectiveness and their execution times. Our findings show that leveraging more refined code change analysis techniques to quantify the extent of changes in source code can lead to relevant improvements in prioritization effectiveness, while typically introducing negligible overheads due to their execution.</p>\",\"PeriodicalId\":48898,\"journal\":{\"name\":\"Journal of Software-Evolution and Process\",\"volume\":\"36 8\",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2024-02-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Software-Evolution and Process\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/smr.2653\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Software-Evolution and Process","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/smr.2653","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

摘要

回归测试优先级排序(RTP)是一个活跃的研究领域,旨在对测试套件中的测试重新排序,以最大限度地提高故障检测率。人们提出了许多 RTP 策略,利用不同的因素对测试重新排序。有些技术包括分析已更改的源代码,为强调代码库已修改部分的测试分配更高的优先级。不过,这些基于变化的解决方案大多侧重于版本间简单的文本级比较。我们认为,以更精细的方式衡量源代码的变化,能够区分单纯的文本变化(如局部变量的重命名)和更多的结构变化(如控制流的变化),可以为 RTP 带来显著的好处,前提是重大的结构变化也更有可能引入故障。为此,我们提出了两种利用树核(TK)的新型 RTP 技术,树核是自然语言处理中广泛应用于树形结构数据的一类相似性函数。特别是,我们将 TK 应用于源代码的抽象语法树,以更精确地量化源代码中结构变化的程度,并相应地确定测试的优先级。我们在五个实际 Java 项目中进行了实证研究,评估了这些建议的有效性。我们为所涉及项目演化过程中的每一对软件版本(即旧版本、新版本)自动生成了 100 种带有人为注入故障的变体,从而产生了超过 5k 种不同的软件演化场景。我们将提出的优先级排序方法与众所周知的优先级排序技术进行了比较,评估了它们的有效性和执行时间。我们的研究结果表明,利用更精细的代码变更分析技术来量化源代码的变更程度,可以显著提高优先级排序的有效性,同时由于其执行而带来的开销通常可以忽略不计。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Regression test prioritization leveraging source code similarity with tree kernels

Regression test prioritization (RTP) is an active research field, aiming at re-ordering the tests in a test suite to maximize the rate at which faults are detected. A number of RTP strategies have been proposed, leveraging different factors to reorder tests. Some techniques include an analysis of changed source code, to assign higher priority to tests stressing modified parts of the codebase. Still, most of these change-based solutions focus on simple text-level comparisons among versions. We believe that measuring source code changes in a more refined way, capable of discriminating between mere textual changes (e.g., renaming of a local variable) and more structural changes (e.g., changes in the control flow), could lead to significant benefits in RTP, under the assumption that major structural changes are also more likely to introduce faults. To this end, we propose two novel RTP techniques that leverage tree kernels (TK), a class of similarity functions largely used in Natural Language Processing on tree-structured data. In particular, we apply TKs to abstract syntax trees of source code, to more precisely quantify the extent of structural changes in the source code, and prioritize tests accordingly. We assessed the effectiveness of the proposals by conducting an empirical study on five real-world Java projects, also used in a number of RTP-related papers. We automatically generated, for each considered pair of software versions (i.e., old version, new version) in the evolution of the involved projects, 100 variations with artificially injected faults, leading to over 5k different software evolution scenarios overall. We compared the proposed prioritization approaches against well-known prioritization techniques, evaluating both their effectiveness and their execution times. Our findings show that leveraging more refined code change analysis techniques to quantify the extent of changes in source code can lead to relevant improvements in prioritization effectiveness, while typically introducing negligible overheads due to their execution.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Software-Evolution and Process
Journal of Software-Evolution and Process COMPUTER SCIENCE, SOFTWARE ENGINEERING-
自引率
10.00%
发文量
109
期刊最新文献
Issue Information Issue Information A hybrid‐ensemble model for software defect prediction for balanced and imbalanced datasets using AI‐based techniques with feature preservation: SMERKP‐XGB Issue Information LLMs for science: Usage for code generation and data analysis
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1