Detecting Hate Speech in Hindi in Online Social Media

Anushka Sharma, Rishabh Kaushal
2023 3rd International Conference on Intelligent Communication and Computational Techniques (ICCT)
Publication date: 2023-01-19
DOI: 10.1109/ICCT56969.2023.10075749 (https://doi.org/10.1109/ICCT56969.2023.10075749)

Abstract

With the rise of online hatred, the artificial intelligence research community, particularly in natural language processing (NLP), has been developing models to identify it. Recently, code-mixing, the use of multiple languages within social media conversations, has made multilingual hate speech a significant difficulty for automated detection. Identifying text that incites hatred on social networking sites is a crucial NLP task, with applications including sentiment analysis, the study of cyberbullying, and studies of societal and political conflict. In this paper, we analyze hate speech detection in a multilingual setting using tweets posted on Twitter. Each tweet is annotated with its text and the speech category (Normal speech or Hate speech) to which it belongs. We propose a supervised method for detecting hate speech. The classification approach uses character-level, word-level, and lexicon-based features to identify hate speech in the corpus. We obtain 96% accuracy in identifying posts across four classifiers.

Index Terms: Hate speech, Multilingual, Code-mixing, NLP
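
The abstract names three feature families (character-level, word-level, and lexicon-based) without giving details. As a rough illustration only, the sketch below shows one way such features might be combined into a single sparse feature dictionary for one tweet. The n-gram sizes, the feature-name prefixes, the whitespace tokenizer, and the `TOY_LEXICON` placeholder are all assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter

# Hypothetical placeholder lexicon; the lexicon used in the paper is not given here.
TOY_LEXICON = {"badword1", "badword2"}


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def word_ngrams(tokens, n=1):
    """Return word n-grams, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, lexicon=TOY_LEXICON):
    """Combine the three feature families into one sparse feature dict."""
    # Whitespace tokenization is an assumption; real tweets need more care.
    tokens = text.lower().split()
    feats = Counter()
    for gram in char_ngrams(text, 3):       # character-level features
        feats["char3=" + gram] += 1
    for gram in word_ngrams(tokens, 1):     # word-level unigrams
        feats["word1=" + gram] += 1
    for gram in word_ngrams(tokens, 2):     # word-level bigrams
        feats["word2=" + gram] += 1
    # Lexicon-based feature: number of tokens found in the lexicon.
    feats["lexicon_hits"] = sum(1 for tok in tokens if tok in lexicon)
    return dict(feats)


features = extract_features("tum badword1 ho")
```

A dictionary like `features` could then be vectorized and passed to any supervised classifier; which four classifiers the authors used is not stated in the abstract.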