An Empirical Study on Source Code Feature Extraction in Preprocessing of IR-Based Requirements Traceability
Bangchao Wang, Yang Deng, Ruiqi Luo, Huan Jin
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), 2022-12-01
DOI: 10.1109/QRS57517.2022.00110 (https://doi.org/10.1109/QRS57517.2022.00110)
Citations: 1
Abstract
In information retrieval-based (IR-based) requirements traceability research, much work has focused on establishing trace links between requirements and source code. However, because source code and requirements are written in very different styles, how the code is preprocessed is crucial to the quality of the generated trace links. This paper draws empirical conclusions, through comprehensive experiments, about three preprocessing steps that affect the quality of trace links generated by IR-based methods between requirements and source code: code feature extraction, annotation importance assessment, and annotation redundancy removal. The results show that feature extraction is recommended when the average annotation density is higher than 0.2, and that removing redundancy from code with high annotation redundancy can enhance the quality of trace links. These findings can help developers improve the quality of trace link generation and offer them advice on writing code.
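The 0.2 threshold can be illustrated with a minimal sketch. The abstract does not define annotation density, so the code assumes it is the fraction of non-blank lines that are comment lines, and that the code uses Java-style `//` and `/* ... */` comments. The function names (`annotation_density`, `average_annotation_density`, `recommend_feature_extraction`) and the sample snippet are hypothetical and are not taken from the paper.

```python
from __future__ import annotations


def annotation_density(source: str) -> float:
    """Fraction of non-blank lines that are comment lines.

    ASSUMPTION: this ratio is only a stand-in for the paper's metric,
    whose exact definition is not given in the abstract.
    """
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    in_block = False  # True while inside an unclosed /* ... */ comment
    comment_lines = 0
    for ln in lines:
        if in_block:
            comment_lines += 1
            if "*/" in ln:
                in_block = False
        elif ln.startswith("//"):
            comment_lines += 1
        elif ln.startswith("/*"):
            comment_lines += 1
            # Stay in block mode unless the comment closes on this line.
            if "*/" not in ln[2:]:
                in_block = True
    return comment_lines / len(lines)


def average_annotation_density(files: list[str]) -> float:
    """Mean per-file density over a collection of source texts."""
    if not files:
        return 0.0
    return sum(annotation_density(f) for f in files) / len(files)


def recommend_feature_extraction(files: list[str], threshold: float = 0.2) -> bool:
    """Apply the abstract's rule: recommend extraction above the threshold."""
    return average_annotation_density(files) > threshold


# Hypothetical sample: 2 comment lines out of 5 non-blank lines.
SAMPLE = """
/** Adds two numbers. */
public int add(int a, int b) {
    // sum the operands
    return a + b;
}
"""

if __name__ == "__main__":
    print(annotation_density(SAMPLE))
    print(recommend_feature_extraction([SAMPLE]))
```

Lines that mix code with a trailing comment are counted as code here; a finer-grained measure, for example one based on tokens or characters, would give different densities and might shift where a given project falls relative to the threshold.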