A Pipeline for Automating Labeling to Prediction in Classification of NFRs

Ranit Chatterjee, Abdul Ahmed, Preethu Rose Anish, B. Suman, Prashant Lawhatre, S. Ghaisas
Published in: 2021 IEEE 29th International Requirements Engineering Conference (RE), 2021-09-01
DOI: 10.1109/RE51729.2021.00036 (https://doi.org/10.1109/RE51729.2021.00036)
Citations: 2

Abstract

Non-Functional Requirements (NFRs) capture the operational constraints of a software system. Detecting NFRs early allows them to be incorporated into the architectural design at an initial stage, which is preferable to expensive refactoring later. Automated identification and classification of NFRs have therefore seen numerous efforts using rule-based, machine learning, and deep learning approaches. One of the major challenges for such automation is the manual effort required to label training data. This is a concern for large software vendors, who typically work on a variety of applications in diverse domains. We address this challenge by designing a pipeline that facilitates classification of NFRs using only a limited amount of labeled data for training (~20% of an available new dataset). We (1) employed Snorkel to automatically label a dataset comprising NFRs from various Software Requirement Specification documents, (2) trained several classifiers on it, and (3) reused these pre-trained classifiers through a Transfer Learning approach to classify NFRs in industry-specific datasets. Among the language model classifiers, the best results were obtained with a BERT-based classifier fine-tuned to learn the linguistic intricacies of three different domain-specific datasets from real-life projects.
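
Step (1), programmatic labeling, might be sketched roughly as below. This is an illustration only: it does not use Snorkel's API and does not reproduce the authors' labeling functions. The NFR classes, the keyword lists, and the helper names (`keyword_lf`, `weak_label`) are all hypothetical, and a plain majority vote stands in for the learned label model that Snorkel would normally fit over the labeling-function outputs.

```python
from collections import Counter

ABSTAIN = -1
# Hypothetical NFR classes, chosen for illustration only.
SECURITY, PERFORMANCE, USABILITY = 0, 1, 2


def keyword_lf(label, keywords):
    """Build a labeling function that votes `label` when any keyword appears."""
    def lf(text):
        lowered = text.lower()
        return label if any(k in lowered for k in keywords) else ABSTAIN
    return lf


# Hypothetical heuristics; real ones would come from domain knowledge.
LABELING_FUNCTIONS = [
    keyword_lf(SECURITY, ("encrypt", "authenticat", "password")),
    keyword_lf(SECURITY, ("unauthorized", "access control")),
    keyword_lf(PERFORMANCE, ("within", "seconds", "latency")),
    keyword_lf(PERFORMANCE, ("throughput", "response time")),
    keyword_lf(USABILITY, ("easy to", "intuitive", "user-friendly")),
]


def weak_label(text):
    """Combine votes by simple majority; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics, leave unlabeled
    return ranked[0][0]


if __name__ == "__main__":
    for sentence in (
        "All passwords shall be encrypted before storage.",
        "The system shall respond within 2 seconds.",
        "The system shall print invoices.",
    ):
        print(weak_label(sentence), sentence)
```

Requirements on which the heuristics abstain or conflict stay unlabeled. In a pipeline like the one described, only the automatically labeled sentences would feed classifier training in step (2).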