Towards offensive language detection and reduction in four Software Engineering communities

Jithin Cheriyan, Bastin Tony Roy Savarimuthu, Stephen Cranefield
Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering
Published: 2021-06-04
DOI: 10.1145/3463274.3463805 (https://doi.org/10.1145/3463274.3463805)
Citations: 18

Towards offensive language detection and reduction in four Software Engineering communities
Software Engineering (SE) communities such as Stack Overflow have become unwelcoming, particularly through members’ use of offensive language. Research has shown that offensive language drives users away from active engagement within these platforms. This work aims to explore this issue more broadly by investigating the nature of offensive language in comments posted by users on four prominent SE platforms: GitHub, Gitter, Slack and Stack Overflow (SO). It proposes an approach to detect and classify offensive language in SE communities by adopting natural language processing and deep learning techniques. Further, it proposes a Conflict Reduction System (CRS), which identifies offence and then suggests changes that could minimize it. Beyond showing that the prevalence of offensive language in over 1 million comments from the four communities ranges from 0.07% to 0.43%, our results show promise in successful detection and classification of such language. The CRS has the potential to drastically reduce the manual moderation effort needed to detect and reduce offence in SE communities.