{"title":"Replication in Requirements Engineering: the NLP for RE Case","authors":"Sallam Abualhaija, Fatma Başak Aydemir, Fabiano Dalpiaz, Davide Dell’Anna, Alessio Ferrari, Xavier Franch, Davide Fucci","doi":"10.1145/3658669","DOIUrl":null,"url":null,"abstract":"<p><b>[Context]</b> Natural language processing (NLP) techniques have been widely applied in the requirements engineering (RE) field to support tasks such as classification and ambiguity detection. Despite its empirical vocation, RE research has given limited attention to replication of NLP for RE studies. Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity of the tasks involving NLP, the tasks’ inherent <i>hairiness</i>, and, in turn, the heterogeneous reporting structure. <b>[Solution]</b> To address these issues, we propose a new artifact, referred to as <span>ID-Card</span>, whose goal is to provide a structured summary of research papers emphasizing replication-relevant information. We construct the <span>ID-Card</span> through a structured, iterative process based on design science. <b>[Results]</b> In this paper: (i) we report on hands-on experiences of replication, (ii) we review the state-of-the-art and extract replication-relevant information, (iii) we identify, through focus groups, challenges across two typical dimensions of replication: data annotation and tool reconstruction, and (iv) we present the concept and structure of the <span>ID-Card</span> to mitigate the identified challenges. <b>[Contribution]</b> This study aims to create awareness of replication in NLP for RE. We propose an <span>ID-Card</span> that is intended to foster study replication, but can also be used in other contexts, e.g., for educational purposes.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"45 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3658669","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
[Context] Natural language processing (NLP) techniques have been widely applied in the requirements engineering (RE) field to support tasks such as classification and ambiguity detection. Despite its empirical vocation, RE research has given limited attention to replication of NLP for RE studies. Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity of the tasks involving NLP, the tasks’ inherent hairiness, and, in turn, the heterogeneous reporting structure. [Solution] To address these issues, we propose a new artifact, referred to as ID-Card, whose goal is to provide a structured summary of research papers emphasizing replication-relevant information. We construct the ID-Card through a structured, iterative process based on design science. [Results] In this paper: (i) we report on hands-on experiences of replication, (ii) we review the state-of-the-art and extract replication-relevant information, (iii) we identify, through focus groups, challenges across two typical dimensions of replication: data annotation and tool reconstruction, and (iv) we present the concept and structure of the ID-Card to mitigate the identified challenges. [Contribution] This study aims to create awareness of replication in NLP for RE. We propose an ID-Card that is intended to foster study replication, but can also be used in other contexts, e.g., for educational purposes.
期刊介绍:
Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.