Application of back-translation: a transfer learning approach to identify ambiguous software requirements

Isha Subedi, Maninder Singh, Vijayalakshmi Ramasamy, G. Walia
Proceedings of the 2021 ACM Southeast Conference, published 2021-04-15
DOI: 10.1145/3409334.3452068 (https://doi.org/10.1145/3409334.3452068)
Citations: 6

Abstract

Ambiguous requirements are a problem in requirements engineering: stakeholders can disagree on how to interpret them, which leads to a variety of issues in the development stages. Because requirement specifications are usually written in natural language, analyzing ambiguous requirements is currently a manual process that has not been fully automated to meet industry standards. In this paper, we applied transfer learning with ULMFiT: we pre-trained our model on a general-domain corpus and then fine-tuned it to classify requirements as ambiguous or unambiguous (the target task). We then compared its accuracy with that of three machine learning classifiers: Support Vector Machines (SVM), Logistic Regression, and Multinomial Naive Bayes. We also used back translation (BT) as a text augmentation technique to see whether it improved classification accuracy. On our initial data set, ULMFiT achieved higher accuracy than SVM, Logistic Regression, and Multinomial Naive Bayes. After the requirements were augmented with BT, ULMFiT again achieved higher accuracy than those three classifiers, improving the initial performance by 5.371%. Our research provides some promising insights into how transfer learning and text augmentation can be applied to small data sets in requirements engineering.
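
The back-translation step can be illustrated with a minimal sketch. The abstract does not say which pivot language or translation system was used, so the translators here are assumptions: `toy_to_pivot` and `toy_from_pivot` are hypothetical word-lookup stand-ins for a real machine-translation model, and `back_translate` and `augment` are illustrative names, not the authors' code. The sketch only shows the general idea: each requirement is translated to a pivot language and back, and the resulting paraphrase is added to the training set with the label of its source.

```python
from typing import Callable, Iterable, List, Tuple

# A translator is any text-to-text function. In practice it would wrap a
# machine-translation model or service (an assumption, not from the abstract).
Translator = Callable[[str], str]
Sample = Tuple[str, str]  # (requirement text, label)


def back_translate(text: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate text into a pivot language and back into the source language."""
    return from_pivot(to_pivot(text))


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def augment(
    samples: Iterable[Sample], to_pivot: Translator, from_pivot: Translator
) -> List[Sample]:
    """Return the original samples followed by their back-translated paraphrases.

    A paraphrase inherits the label of its source requirement. It is skipped
    when it is empty or duplicates a text that is already in the data set.
    """
    originals = list(samples)
    augmented = list(originals)
    seen = {_normalize(text) for text, _ in originals}
    for text, label in originals:
        paraphrase = back_translate(text, to_pivot, from_pivot)
        key = _normalize(paraphrase)
        if key and key not in seen:
            seen.add(key)
            augmented.append((paraphrase, label))
    return augmented


# Hypothetical toy lexicons standing in for a real translation system.
_EN_TO_PIVOT = {"shall": "doit", "respond": "répondre", "quickly": "rapidement"}
_PIVOT_TO_EN = {"doit": "must", "répondre": "reply", "rapidement": "fast"}


def toy_to_pivot(text: str) -> str:
    """Word-by-word lookup into the pivot lexicon; unknown words pass through."""
    return " ".join(_EN_TO_PIVOT.get(word, word) for word in text.split())


def toy_from_pivot(text: str) -> str:
    """Word-by-word lookup back into English; unknown words pass through."""
    return " ".join(_PIVOT_TO_EN.get(word, word) for word in text.split())


data = [
    ("The system shall respond quickly", "ambiguous"),
    ("The system logs errors", "unambiguous"),
]
augmented_data = augment(data, toy_to_pivot, toy_from_pivot)
for text, label in augmented_data:
    print(f"{label}: {text}")
```

Passing the translators in as functions keeps the augmentation logic independent of any particular translation system. The second toy requirement comes back unchanged and is therefore not duplicated, which matters on small data sets where repeated samples could distort the class balance.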