Enhancing Unsupervised Requirements Traceability with Sequential Semantics

2019 26th Asia-Pacific Software Engineering Conference (APSEC). Pub Date: 2019-12-01. DOI: 10.1109/APSEC48747.2019.00013

Lei Chen, Dandan Wang, Junjie Wang, Qing Wang


Citations: 8

Abstract

Requirements traceability provides important support throughout the software life cycle; however, creating trace links manually is time-consuming and error-prone. Supervised automated solutions use machine learning or deep learning techniques to generate trace links, but they require a large labeled dataset to train an effective model. Unsupervised solutions, such as word embedding approaches, can generate links by capturing the semantic meaning of artifacts and are gaining more attention. However, we observed that, besides the semantic information, the sequential information of terms in the artifacts provides additional help in building accurate links. This paper proposes an unsupervised requirements traceability approach, named S2Trace, which learns the Sequential Semantics of software artifacts to generate trace links. Its core idea is to mine sequential patterns and use them to learn a document embedding representation. The evaluation is conducted on five public datasets, and the results show that our approach outperforms three typical baselines. The modeling of sequential information in this paper provides new insights into unsupervised traceability solutions, and the improvement in traceability accuracy further demonstrates the usefulness of sequential information.
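
To make the core idea concrete, the following is a minimal sketch of how term order could inform link ranking. It is not the S2Trace implementation: the abstract does not specify the mining algorithm or the embedding model, so this sketch substitutes contiguous n-grams for mined sequential patterns and sparse pattern counts for a learned document embedding. All function names, parameters (`max_len`, `min_support`), and the sample artifacts are hypothetical.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    # Simplification: lowercase and split on whitespace only.
    return text.lower().split()


def ngrams(tokens, n):
    # Contiguous term sequences of length n, in document order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mine_patterns(token_lists, max_len=3, min_support=2):
    # Assumption: a "sequential pattern" is an n-gram (n = 1..max_len)
    # that occurs in at least min_support documents.
    support = Counter()
    for tokens in token_lists:
        seen = set()
        for n in range(1, max_len + 1):
            seen.update(ngrams(tokens, n))
        support.update(seen)  # count each pattern once per document
    return {p for p, s in support.items() if s >= min_support}


def embed(tokens, patterns, max_len=3):
    # Stand-in for a learned embedding: sparse counts of mined patterns.
    vec = Counter()
    for n in range(1, max_len + 1):
        for gram in ngrams(tokens, n):
            if gram in patterns:
                vec[gram] += 1
    return vec


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_links(sources, targets, max_len=3, min_support=2):
    # Assumes source and target IDs are distinct.
    tokens = {k: tokenize(v) for k, v in {**sources, **targets}.items()}
    patterns = mine_patterns(tokens.values(), max_len, min_support)
    vecs = {k: embed(t, patterns, max_len) for k, t in tokens.items()}
    links = [(s, t, cosine(vecs[s], vecs[t])) for s in sources for t in targets]
    return sorted(links, key=lambda link: -link[2])


# Hypothetical artifacts, for illustration only.
requirements = {"R1": "encrypt user password before storage"}
code_docs = {
    "T1": "encrypt user password with salt",
    "T2": "render user profile page",
}
ranked = rank_links(requirements, code_docs)
for source_id, target_id, score in ranked:
    print(source_id, target_id, round(score, 3))
```

In this toy example both targets share the term "user" with the requirement, but only T1 shares the ordered sequence "encrypt user password", so the multi-term patterns push R1-T1 to the top of the ranking. A learned embedding, as described in the abstract, would replace the count vectors used here.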


