CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse
F. Jafri, Kritesh Rauniyar, Surendrabikram Thapa, Mohammad Aman Siddiqui, Matloob Khushi, Usman Naseem
DOI: 10.1145/3665245 (https://doi.org/10.1145/3665245)
Published: 2024-05-16
Abstract
In the ever-evolving landscape of online discourse and political dialogue, the rise of hate speech poses a significant challenge to maintaining a respectful and inclusive digital environment. The problem is particularly complex for Hindi, a low-resource language with limited available data. To address this concern, we introduce the CHUNAV dataset, a collection of 11,457 Hindi tweets gathered during assembly elections in various states. CHUNAV is purpose-built for hate speech categorization and the identification of target groups, and it is a resource for exploring hate speech within the distinctive socio-political context of Indian elections. The tweets in CHUNAV have been categorized into “Hate” and “Non-Hate” labels, and further subdivided to pinpoint the specific targets of hate speech, including “Individual”, “Organization”, and “Community” labels (as shown in Figure 1). Furthermore, this paper presents multiple benchmark models for hate speech detection, along with an ensemble and oversampling-based method. The paper also reports the results of topic modeling. Together, these contributions aim to address hate speech and target identification in Hindi, advance the field of hate speech analysis, and foster a safer and more inclusive online space in the context of Indian Assembly Elections.