四个软件工程社区对攻击性语言的检测和减少

Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering Pub Date : 2021-06-04 DOI:10.1145/3463274.3463805

Jithin Cheriyan, Bastin Tony Roy Savarimuthu, Stephen Cranefield

{"title":"四个软件工程社区对攻击性语言的检测和减少","authors":"Jithin Cheriyan, Bastin Tony Roy Savarimuthu, Stephen Cranefield","doi":"10.1145/3463274.3463805","DOIUrl":null,"url":null,"abstract":"Software Engineering (SE) communities such as Stack Overflow have become unwelcoming, particularly through members’ use of offensive language. Research has shown that offensive language drives users away from active engagement within these platforms. This work aims to explore this issue more broadly by investigating the nature of offensive language in comments posted by users in four prominent SE platforms – GitHub, Gitter, Slack and Stack Overflow (SO). It proposes an approach to detect and classify offensive language in SE communities by adopting natural language processing and deep learning techniques. Further, a Conflict Reduction System (CRS), which identifies offence and then suggests what changes could be made to minimize offence has been proposed. Beyond showing the prevalence of offensive language in over 1 million comments from four different communities which ranges from 0.07% to 0.43%, our results show promise in successful detection and classification of such language. The CRS system has the potential to drastically reduce manual moderation efforts to detect and reduce offence in SE communities.","PeriodicalId":328024,"journal":{"name":"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Towards offensive language detection and reduction in four Software Engineering communities\",\"authors\":\"Jithin Cheriyan, Bastin Tony Roy Savarimuthu, Stephen Cranefield\",\"doi\":\"10.1145/3463274.3463805\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Software Engineering (SE) communities such as Stack Overflow have become unwelcoming, particularly through members’ use of offensive language. Research has shown that offensive language drives users away from active engagement within these platforms. This work aims to explore this issue more broadly by investigating the nature of offensive language in comments posted by users in four prominent SE platforms – GitHub, Gitter, Slack and Stack Overflow (SO). It proposes an approach to detect and classify offensive language in SE communities by adopting natural language processing and deep learning techniques. Further, a Conflict Reduction System (CRS), which identifies offence and then suggests what changes could be made to minimize offence has been proposed. Beyond showing the prevalence of offensive language in over 1 million comments from four different communities which ranges from 0.07% to 0.43%, our results show promise in successful detection and classification of such language. The CRS system has the potential to drastically reduce manual moderation efforts to detect and reduce offence in SE communities.\",\"PeriodicalId\":328024,\"journal\":{\"name\":\"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3463274.3463805\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3463274.3463805","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 18

摘要

像Stack Overflow这样的软件工程(SE)社区已经变得不受欢迎，特别是因为成员使用了攻击性语言。研究表明，攻击性语言会让用户远离这些平台。这项工作旨在通过调查四个著名SE平台(GitHub、Gitter、Slack和Stack Overflow (SO))上用户发布的评论中攻击性语言的性质，更广泛地探讨这个问题。提出了一种采用自然语言处理和深度学习技术对SE社区中的攻击性语言进行检测和分类的方法。此外，还提出了一个减少冲突制度(CRS)，该制度查明罪行，然后提出可以作出哪些改变以尽量减少罪行。除了显示来自四个不同社区的100多万条评论中攻击性语言的流行程度(范围从0.07%到0.43%)之外，我们的结果显示了成功检测和分类此类语言的希望。CRS系统有可能大幅减少人工审核工作，以发现和减少SE社区的犯罪行为。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Towards offensive language detection and reduction in four Software Engineering communities

Software Engineering (SE) communities such as Stack Overflow have become unwelcoming, particularly through members’ use of offensive language. Research has shown that offensive language drives users away from active engagement within these platforms. This work aims to explore this issue more broadly by investigating the nature of offensive language in comments posted by users in four prominent SE platforms – GitHub, Gitter, Slack and Stack Overflow (SO). It proposes an approach to detect and classify offensive language in SE communities by adopting natural language processing and deep learning techniques. Further, a Conflict Reduction System (CRS), which identifies offence and then suggests what changes could be made to minimize offence has been proposed. Beyond showing the prevalence of offensive language in over 1 million comments from four different communities which ranges from 0.07% to 0.43%, our results show promise in successful detection and classification of such language. The CRS system has the potential to drastically reduce manual moderation efforts to detect and reduce offence in SE communities.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering

自引率

0.00%

发文量

期刊最新文献

About the Assessment of Grey Literature in Software Engineering Towards an Automated Classification Approach for Software Engineering Research Fog Based Energy Efficient Process Framework for Smart Building Open Data-driven Usability Improvements of Static Code Analysis and its Challenges Towards a corpus for credibility assessment in software practitioner blog articles