Classifying User Requirements from Online Feedback in Small Dataset Environments using Deep Learning

R. Mekala, Asif Irfan, Eduard C. Groen, Adam Porter, Mikael Lindvall
Published in: 2021 IEEE 29th International Requirements Engineering Conference (RE), 2021-09-01
DOI: https://doi.org/10.1109/RE51729.2021.00020
Citations: 13

Abstract

An overwhelming number of users access app repositories like App Store/Google Play and social media platforms like Twitter, where they provide feedback on digital experiences. This vast textual corpus comprising user feedback has the potential to unearth detailed insights regarding the users’ opinions on products and services. Various tools have been proposed that employ natural language processing (NLP) and traditional machine learning (ML) based models as an inexpensive mechanism to identify requirements in user feedback. However, they fall short on their classification accuracy over unseen data due to factors like the cost of generating voluminous de-biased labeled datasets and general inefficiency. Recently, Van Vliet et al. [1] achieved state-of-the-art results extracting and classifying requirements from user reviews through traditional crowdsourcing. Based on their reference classification tasks and outcomes, we successfully developed and validated a deep-learning-backed artificial intelligence pipeline to achieve a state-of-the-art averaged classification accuracy of ∼87% on standard tasks for user feedback analysis. This approach, which comprises a BERT-based sequence classifier, proved effective even in extremely low-volume dataset environments. Additionally, our approach drastically reduces the time and costs of evaluation, and improves on the accuracy measures achieved using traditional ML-/NLP-based techniques.
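
The abstract reports an accuracy figure averaged over several classification tasks but does not say how the average is formed. As a minimal sketch, assuming an unweighted mean of per-task accuracies, the computation could look like the following. The task names, labels, and helper functions are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def accuracy(gold, predicted):
    """Fraction of items whose predicted label matches the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def averaged_accuracy(task_results):
    """Unweighted mean of per-task accuracies.

    Assumption: the paper's 'averaged' accuracy may be computed differently
    (for example, weighted by task size); this is only one plausible reading.
    """
    return mean(accuracy(gold, pred) for gold, pred in task_results.values())


# Hypothetical toy labels for two made-up tasks, NOT data from the paper.
toy_results = {
    "is_requirement": (
        ["yes", "no", "yes", "no"],   # gold labels
        ["yes", "no", "no", "no"],    # classifier predictions
    ),
    "requirement_type": (
        ["feature", "quality", "feature", "quality"],
        ["feature", "quality", "feature", "quality"],
    ),
}

per_task = {name: accuracy(g, p) for name, (g, p) in toy_results.items()}
overall = averaged_accuracy(toy_results)
print(per_task, overall)
```

An unweighted mean treats every task equally regardless of how many feedback items it contains, which matters in small-dataset settings where task sizes can differ widely.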