探索在测试到代码的可追溯性中利用概念信息的好处

2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE) Pub Date : 2018-05-28 DOI:10.1145/3194104.3194106

András Kicsi, L. Tóth, László Vidács

{"title":"探索在测试到代码的可追溯性中利用概念信息的好处","authors":"András Kicsi, L. Tóth, László Vidács","doi":"10.1145/3194104.3194106","DOIUrl":null,"url":null,"abstract":"Striving for reliability of software systems often results in immense numbers of tests. Due to the lack of a generally used annotation, finding the parts of code these tests were meant to assess can be a demanding task. This is a valid problem of software engineering called test-to-code traceability. Recent research on the subject has attempted to cope with this problem applying various approaches and their combinations, achieving profound results. These approaches have involved the use of naming conventions during development processes and also have utilized various information retrieval (IR) methods often referred to as conceptual information. In this work we investigate the benefits of textual information located in software code and its value for aiding traceability. We evaluated the capabilities of the natural language processing technique called Latent Semantic Indexing (LSI) in the view of the results of the naming conventions technique on five real, medium sized software systems. Although LSI is already used for this purpose, we extend the viewpoint of one-to-one traceability approach to the more versatile view of LSI as a recommendation system. We found that considering the top 5 elements in the ranked list increases the results by 30% on average and makes LSI a viable alternative in projects where naming conventions are not followed systematically.","PeriodicalId":249268,"journal":{"name":"2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Exploring the Benefits of Utilizing Conceptual Information in Test-to-Code Traceability\",\"authors\":\"András Kicsi, L. Tóth, László Vidács\",\"doi\":\"10.1145/3194104.3194106\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Striving for reliability of software systems often results in immense numbers of tests. Due to the lack of a generally used annotation, finding the parts of code these tests were meant to assess can be a demanding task. This is a valid problem of software engineering called test-to-code traceability. Recent research on the subject has attempted to cope with this problem applying various approaches and their combinations, achieving profound results. These approaches have involved the use of naming conventions during development processes and also have utilized various information retrieval (IR) methods often referred to as conceptual information. In this work we investigate the benefits of textual information located in software code and its value for aiding traceability. We evaluated the capabilities of the natural language processing technique called Latent Semantic Indexing (LSI) in the view of the results of the naming conventions technique on five real, medium sized software systems. Although LSI is already used for this purpose, we extend the viewpoint of one-to-one traceability approach to the more versatile view of LSI as a recommendation system. We found that considering the top 5 elements in the ranked list increases the results by 30% on average and makes LSI a viable alternative in projects where naming conventions are not followed systematically.\",\"PeriodicalId\":249268,\"journal\":{\"name\":\"2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-05-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3194104.3194106\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3194104.3194106","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 13

摘要

对软件系统可靠性的追求常常导致大量的测试。由于缺乏常用的注释，找到这些测试要评估的代码部分可能是一项艰巨的任务。这是软件工程中一个有效的问题，称为测试到代码的可追溯性。近年来对这一问题的研究试图运用各种方法及其组合来解决这一问题，取得了深刻的成果。这些方法涉及在开发过程中使用命名约定，并且还利用了通常称为概念信息的各种信息检索(IR)方法。在这项工作中，我们研究了位于软件代码中的文本信息的好处及其在帮助可追溯性方面的价值。根据命名约定技术在五个真实的中型软件系统上的结果，我们评估了称为潜在语义索引(LSI)的自然语言处理技术的能力。虽然LSI已经用于此目的，但我们将一对一可追溯性方法的观点扩展到LSI作为推荐系统的更通用的观点。我们发现，考虑排名表中的前5个元素平均可使结果提高30%，并使LSI成为没有系统遵循命名惯例的项目的可行选择。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Exploring the Benefits of Utilizing Conceptual Information in Test-to-Code Traceability

Striving for reliability of software systems often results in immense numbers of tests. Due to the lack of a generally used annotation, finding the parts of code these tests were meant to assess can be a demanding task. This is a valid problem of software engineering called test-to-code traceability. Recent research on the subject has attempted to cope with this problem applying various approaches and their combinations, achieving profound results. These approaches have involved the use of naming conventions during development processes and also have utilized various information retrieval (IR) methods often referred to as conceptual information. In this work we investigate the benefits of textual information located in software code and its value for aiding traceability. We evaluated the capabilities of the natural language processing technique called Latent Semantic Indexing (LSI) in the view of the results of the naming conventions technique on five real, medium sized software systems. Although LSI is already used for this purpose, we extend the viewpoint of one-to-one traceability approach to the more versatile view of LSI as a recommendation system. We found that considering the top 5 elements in the ranked list increases the results by 30% on average and makes LSI a viable alternative in projects where naming conventions are not followed systematically.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)

自引率

0.00%

发文量