Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH): Latest Articles

Accounting for Offensive Speech as a Practice of Resistance

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.18

Mark Díaz, Razvan Amironesei, Laura Weidinger, Iason Gabriel

Tasks such as toxicity detection, hate speech detection, and online harassment detection have been developed for identifying interactions involving offensive speech. In this work we articulate the need for a relational understanding of offensiveness to help distinguish denotative offensive speech from offensive speech serving as a mechanism through which marginalized communities resist oppressive social norms. Using examples from the queer community, we argue that evaluations of offensive speech must focus on the impacts of language use. We call this the cynic perspective: a characteristic of language with roots in Cynic philosophy that pertains to employing offensive speech as a practice of resistance. We also explore the degree to which NLP systems may encounter limits to modeling relational context.

Citations: 3

Distributional properties of political dogwhistle representations in Swedish BERT

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.16

Niclas Hertzberg, R. Cooper, Eveliina Lindgren, B. Rönnerstrand, Gregor Rettenegger, Ellen Breitholtz, A. Sayeed

“Dogwhistles” are expressions intended by the speaker to have two messages: a socially unacceptable “in-group” message understood by a subset of listeners, and a benign message intended for the out-group. We take the results of a word-replacement survey of the Swedish population, intended to reveal how dogwhistles are understood, and show that the difficulty of annotating dogwhistles is reflected in the separability in the space of a sentence-transformer Swedish BERT trained on general data.

Citations: 2

Users Hate Blondes: Detecting Sexism in User Comments on Online Romanian News

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.21

Andreea-Loredana Moldovan, Karla-Claudia Csürös, Ana-Maria Bucur, Loredana Bercuci

Romania ranks almost last in Europe when it comes to gender equality in political representation, with about 10% fewer women in politics than the E.U. average. We proceed from the assumption that this underrepresentation is also influenced by the sexism and verbal abuse female politicians face in the public sphere, especially in online media. We collect a novel dataset of sexist comments in the Romanian language from newspaper articles about Romanian female politicians and propose baseline models, using classical machine learning models and fine-tuned pretrained transformer models, for the classification of sexist language in the online medium.

Citations: 1

Revisiting Queer Minorities in Lexicons

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.23

Krithika Ramesh, Sumeet Kumar, Ashiqur R. KhudaBukhsh

Lexicons play an important role in content moderation, often being the first line of defense. However, little or no literature analyzes the representation of queer-related words in them. In this paper, we consider twelve well-known lexicons containing inappropriate words and analyze how gender and sexual minorities are represented in these lexicons. Our analyses reveal that several of these lexicons barely make any distinction between pejorative and non-pejorative queer-related words. We express concern that such unfettered usage of non-pejorative queer-related words may impact queer presence in mainstream discourse. Our analyses further reveal that the lexicons have poor overlap in queer-related words. We finally present a quantifiable measure of consistency and show that several of these lexicons are not consistent in how they include (or omit) queer-related words.
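
The abstract does not define its overlap analysis or its consistency measure, so the following is only an illustrative sketch of one common way to quantify overlap between word lists: pairwise Jaccard similarity restricted to a vocabulary of interest. The lexicon names, the placeholder terms, and the functions `jaccard` and `pairwise_overlap` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|; two empty sets are treated as identical (1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(
    lexicons: dict[str, set[str]], vocabulary: set[str]
) -> dict[tuple[str, str], float]:
    """Jaccard overlap for every pair of lexicons, counting only words in `vocabulary`."""
    restricted = {name: words & vocabulary for name, words in lexicons.items()}
    return {
        (x, y): jaccard(restricted[x], restricted[y])
        for x, y in combinations(sorted(restricted), 2)
    }


# Hypothetical toy data: placeholder strings stand in for real lexicon entries.
lexicons = {
    "lex_a": {"term1", "term2", "unrelated"},
    "lex_b": {"term2", "term3"},
    "lex_c": set(),
}
vocabulary = {"term1", "term2", "term3"}

overlaps = pairwise_overlap(lexicons, vocabulary)
for pair, score in overlaps.items():
    print(pair, round(score, 3))
```

A low score for a pair means the two lexicons disagree about which words in the vocabulary to include, which is the kind of disagreement the abstract describes as poor overlap.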

Citations: 1

Towards Automatic Generation of Messages Countering Online Hate Speech and Microaggressions

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.2

Mana Ashida, Mamoru Komachi

With the widespread use of social media, online hate is increasing, and microaggressions are receiving attention. We explore the potential for using pretrained language models to automatically generate messages that combat the associated offensive texts. Specifically, we focus on using prompting to steer model generation, as it requires less data and computation than fine-tuning. We also propose a human evaluation perspective: offensiveness, stance, and informativeness. After obtaining 306 counterspeech and 42 microintervention messages generated by GPT-{2, 3, Neo}, we conducted a human evaluation using Amazon Mechanical Turk. The results indicate the potential of using prompting in the proposed generation task. All the generated texts, along with the annotations, are published to encourage future research on countering hate and microaggressions online.

Citations: 15

HATE-ITA: Hate Speech Detection in Italian Social Media Text

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.24

Debora Nozza, Federico Bianchi, Giuseppe Attanasio

Citations: 2

The subtle language of exclusion: Identifying the Toxic Speech of Trans-exclusionary Radical Feminists

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.8

Christina T. Lu, David Jurgens

Toxic language can take many forms, from explicit hate speech to more subtle microaggressions. Within this space, models identifying transphobic language have largely focused on overt forms. However, a more pernicious and subtle source of transphobic comments comes in the form of statements made by Trans-exclusionary Radical Feminists (TERFs); these statements often appear positive and promote women’s causes and issues, while simultaneously denying the inclusion of transgender women as women. Here, we introduce two models to mitigate this antisocial behavior. The first model identifies TERF users in social media, recognizing that these users are a main source of transphobic material that enters mainstream discussion and that other users may not wish to engage with them in good faith. The second model tackles the harder task of recognizing the masked rhetoric of TERF messages and introduces a new dataset to support this task. Finally, we discuss the ethics of deploying these models to mitigate the harm of this language, arguing for a balanced approach that allows for restorative interactions.

Citations: 3

Lost in Distillation: A Case Study in Toxicity Modeling

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.9

Alyssa Chvasta, Alyssa Lees, Jeffrey Sorensen, Lucy Vasserman, Nitesh Goyal

In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one. In particular, distillation is of tremendous benefit when it comes to real-world constraints such as serving latency or serving at scale. However, a loss of robustness in language understanding may be hidden in the process and not immediately revealed when looking at high-level evaluation metrics. In this work, we investigate the hidden costs: what is “lost in distillation”, especially with regard to identity-based model bias, using toxicity modeling as a case study. With reproducible models using open source training sets, we investigate models distilled from a BERT teacher baseline. Using both open source and proprietary big data models, we investigate these hidden performance costs.
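
The abstract does not say which distillation objective was used. As background only, here is a minimal sketch of the widely used soft-target formulation: a temperature-softened KL term between teacher and student, blended with ordinary cross-entropy on the hard label. The function names, the temperature of 2.0, and the weight `alpha` are assumptions chosen for illustration.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Numerically stable softmax over logits divided by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: list[float],
    teacher_logits: list[float],
    label: int,
    temperature: float = 2.0,  # assumed value, not from the paper
    alpha: float = 0.5,        # assumed weighting, not from the paper
) -> float:
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy on the label."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Standard cross-entropy of the student against the hard label (T = 1).
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature**2 * kl + (1 - alpha) * hard


# Toy binary example (classes: non-toxic, toxic) with made-up logits.
teacher = [0.5, 3.0]
close_student = [0.4, 2.8]
far_student = [2.0, 0.1]
print(distillation_loss(close_student, teacher, label=1))
print(distillation_loss(far_student, teacher, label=1))
```

A single aggregate loss like this is exactly the kind of high-level number that can look fine while behavior on specific subsets of inputs degrades, which is why per-subgroup evaluation is needed alongside it.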

Citations: 0

A Comprehensive Dataset for German Offensive Language and Conversation Analysis

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.14

Christoph Demus, Jonas Pitz, Mina Schütz, Nadine Probol, M. Siegel, D. Labudde

In this work, we present a new publicly available offensive language dataset of 10,278 German social media comments, collected in the first half of 2021 and annotated by a total of six annotators. With twelve different annotation categories, it is far more comprehensive than other datasets and goes beyond hate speech detection alone. In particular, the labels also cover toxicity, criminal relevance, and types of discrimination. Furthermore, about half of the comments come from coherent parts of conversations, which makes it possible to consider the comments’ contexts and to analyze conversations in order to study the contagion of offensive language.

Citations: 12

Targeted Identity Group Prediction in Hate Speech Corpora

Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.22

Pratik S. Sachdeva, Renata Barreto, Claudia von Vacano, Chris J. Kennedy

The past decade has seen an abundance of work seeking to detect, characterize, and measure online hate speech. A related but less studied problem is the detection of identity groups targeted by that hate speech. Predictive accuracy on this task can supplement additional analyses beyond hate speech detection, motivating its study. Using the Measuring Hate Speech corpus, which provided annotations for targeted identity groups, we created neural network models to perform multi-label binary prediction of identity groups targeted by a comment. Specifically, we studied 8 broad identity groups and 12 identity sub-groups within race and gender identity. We found that these networks exhibited good predictive performance, achieving ROC AUCs of greater than 0.9 and PR AUCs of greater than 0.7 on several identity groups. We validated their performance on HateCheck and Gab Hate Corpora, finding that predictive performance generalized in most settings. We additionally examined the performance of the model on comments targeting multiple identity groups. Our results demonstrate the feasibility of simultaneously identifying targeted groups in social media comments.
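
To make the evaluation setup concrete, the sketch below scores a multi-label prediction one identity group at a time, computing ROC AUC from pairwise ranking and average precision as a common estimate of PR AUC. It is a toy illustration, not the authors' code: the two group names, the labels, and the scores are invented, and the paper may compute PR AUC differently.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: list[int], scores: list[float]) -> float:
    """Mean of precision at each positive, ranking by descending score (ties unordered)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


def per_group_metrics(y_true, y_score, groups):
    """Treat each column as an independent binary task and score it separately."""
    results = {}
    for j, group in enumerate(groups):
        col_true = [row[j] for row in y_true]
        col_score = [row[j] for row in y_score]
        results[group] = (roc_auc(col_true, col_score),
                          average_precision(col_true, col_score))
    return results


# Invented toy data: each row is one comment, each column one identity group.
groups = ["race", "gender"]
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

metrics = per_group_metrics(y_true, y_score, groups)
for group, (auc, ap) in metrics.items():
    print(f"{group}: ROC AUC={auc:.3f}, AP={ap:.3f}")
```

Reporting both metrics per group matters because rarely targeted groups can show a high ROC AUC while precision-recall performance stays noticeably lower.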

Citations: 1
