
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021): Latest Papers

Offensive Language Detection in Nepali Social Media
Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.woah-1.7
Nobal B. Niraula, S. Dulal, Diwa Koirala
Social media texts such as blog posts, comments, and tweets often contain offensive languages including racial hate speech comments, personal attacks, and sexual harassment. Detecting inappropriate use of language is, therefore, of utmost importance for the safety of the users as well as for suppressing hateful conduct and aggression. Existing approaches to this problem are mostly available for resource-rich languages such as English and German. In this paper, we characterize the offensive language in Nepali, a low-resource language, highlighting the challenges that need to be addressed for processing Nepali social media text. We also present experiments for detecting offensive language using supervised machine learning. Besides contributing the first baseline approaches of detecting offensive language in Nepali, we also release human annotated data sets to encourage future research on this crucial topic.
Citations: 7
Context Sensitivity Estimation in Toxicity Detection
Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.woah-1.15
A. Xenos, John Pavlopoulos, Ion Androutsopoulos
User posts whose perceived toxicity depends on the conversational context are rare in current toxicity detection datasets. Hence, toxicity detectors trained on current datasets will also disregard context, making the detection of context-sensitive toxicity a lot harder when it occurs. We constructed and publicly release a dataset of 10k posts with two kinds of toxicity labels per post, obtained from annotators who considered (i) both the current post and the previous one as context, or (ii) only the current post. We introduce a new task, context-sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context (previous post) is also considered. Using the new dataset, we show that systems can be developed for this task. Such systems could be used to enhance toxicity detection datasets with more context-dependent posts or to suggest when moderators should consider the parent posts, which may not always be necessary and may introduce additional costs.
Citations: 11
VL-BERT+: Detecting Protected Groups in Hateful Multimodal Memes
Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.woah-1.22
Piush Aggarwal, Michelle Espranita Liman, Darina Gold, Torsten Zesch
This paper describes our submission (winning solution for Task A) to the Shared Task on Hateful Meme Detection at WOAH 2021. We build our system on top of a state-of-the-art system for binary hateful meme classification that already uses image tags such as race, gender, and web entities. We add further metadata such as emotions and experiment with data augmentation techniques, as hateful instances are underrepresented in the data set.
Citations: 2