Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH): Latest Papers

Cleansing & expanding the HURTLEX(el) with a multidimensional categorization of offensive words
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.10
Vivian Stamou, Iakovi Alexiou, Antigone Klimi, Eleftheria Molou, Alexandra Saivanidou, Stella Markantonatou
We present a cleansed version of the multilingual lexicon HURTLEX-(EL) comprising 737 offensive words of Modern Greek. We worked bottom-up in two annotation rounds and developed detailed guidelines by cross-classifying words on three dimensions: context, reference, and thematic domain. Our classification reveals a wider spectrum of thematic domains concerning the study of offensive language than previously thought (Efthymiou et al., 2014) and reveals social and cultural aspects that are not included in the HURTLEX categories.
Citations: 4
Free speech or Free Hate Speech? Analyzing the Proliferation of Hate Speech in Parler
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.11
Abraham Israeli, Oren Tsur
Social platforms such as Gab and Parler, branded as ‘free-speech’ networks, have seen a significant growth of their user base in recent years. This popularity is mainly attributed to the stricter moderation enforced by mainstream platforms such as Twitter, Facebook, and Reddit. In this work we provide the first large-scale analysis of hate-speech on Parler. We experiment with an array of algorithms for hate-speech detection, demonstrating limitations of transfer learning in that domain, given the elusive and ever-changing nature of the ways hate-speech is delivered. In order to improve classification accuracy we annotated 10K Parler posts, which we use to fine-tune a BERT classifier. Classification of individual posts is then leveraged for the classification of millions of users via label propagation over the social network. Classifying users by their propensity to disseminate hate, we find that hate mongers make up 16.1% of Parler's active users, and that they have distinct characteristics compared to other user groups. We further complement our analysis by comparing the trends observed in Parler to those found in Gab. To the best of our knowledge, this is among the first works to analyze hate speech in Parler in a quantitative manner and on the user level.
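The step from post-level labels to user-level labels can be illustrated with a minimal label-propagation sketch. This is not the authors' implementation: the abstract does not describe the graph construction or the update rule, so the undirected graph, the seed scores, the blending weight `alpha`, and the function name `propagate` are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate(edges, seeds, alpha=0.5, iterations=20):
    """Spread hate scores in [0, 1] over an undirected user graph.

    edges: iterable of (user_a, user_b) pairs (hypothetical social links).
    seeds: dict user -> score derived from that user's classified posts.
    alpha: weight kept on a user's own seed score (assumed value).
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    users = set(neighbours) | set(seeds)
    # Users without any classified posts start from a neutral score of 0.0.
    scores = {u: seeds.get(u, 0.0) for u in users}

    for _ in range(iterations):
        updated = {}
        for u in users:
            nbrs = neighbours[u]
            nbr_mean = sum(scores[v] for v in nbrs) / len(nbrs) if nbrs else 0.0
            if u in seeds:
                # Seeded users blend their own evidence with their neighbourhood.
                updated[u] = alpha * seeds[u] + (1 - alpha) * nbr_mean
            else:
                # Unseeded users take the average score of their neighbours.
                updated[u] = nbr_mean
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy chain a - b - c: "a" has hateful posts, "c" has none, "b" is unlabeled.
    print(propagate([("a", "b"), ("b", "c")], {"a": 1.0, "c": 0.0}))
```

In this toy chain the unlabeled user ends up with a score between those of its two neighbours, which is the behaviour that lets a small set of annotated posts inform labels for a much larger user population.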
Citations: 6
“Zo Grof !”: A Comprehensive Corpus for Offensive and Abusive Language in Dutch
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.5
Ward Ruitenbeek, Victor Zwart, Robin Van Der Noord, Zhenja Gnezdilov, T. Caselli
This paper presents a comprehensive corpus for the study of socially unacceptable language in Dutch. The corpus extends and revises an existing resource with more data and introduces a new annotation dimension for offensive language, making it a unique resource in the Dutch language panorama. Each language phenomenon (abusive and offensive language) in the corpus has been annotated with a multi-layer annotation scheme modelling the explicitness and the target(s) of the message. We have conducted a new set of experiments with different classification algorithms on all annotation dimensions. Monolingual Pre-Trained Language Models prove to be the best systems, obtaining a macro-average F1 of 0.828 for binary classification of offensive language, and 0.579 for the targets of offensive messages. Furthermore, the best system obtains a macro-average F1 of 0.667 for distinguishing between abusive and offensive messages.
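For readers unfamiliar with the metric quoted above, macro-average F1 is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The sketch below computes it from scratch; the label names `OFF` and `NOT` and the toy predictions are hypothetical and are not taken from the corpus.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    gold = ["OFF", "OFF", "NOT", "NOT"]
    predicted = ["OFF", "NOT", "NOT", "NOT"]
    print(round(macro_f1(gold, predicted), 3))
```

Because each class contributes equally, a system that ignores a minority class is penalized heavily, which is one reason the metric is common for imbalanced offensive-language data.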
Citations: 3
Towards a Multi-Entity Aspect-Based Sentiment Analysis for Characterizing Directed Social Regard in Online Messaging
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.19
Joan Zheng, Scott Friedman, S. Schmer-Galunder, Ian H. Magnusson, Ruta Wheelock, Jeremy Gottlieb, Diana Gomez, Christopher Miller
Online messaging is dynamic, influential, and highly contextual, and a single post may contain contrasting sentiments towards multiple entities, such as dehumanizing one actor while empathizing with another in the same message. These complexities are important to capture for understanding the systematic abuse voiced within an online community, or for determining whether individuals are advocating for abuse, opposing abuse, or simply reporting abuse. In this work, we describe a formulation of directed social regard (DSR) as a problem of multi-entity aspect-based sentiment analysis (ME-ABSA), which models the degree of intensity of multiple sentiments that are associated with entities described by a text document. Our DSR schema is informed by Bandura’s psychosocial theory of moral disengagement and by recent work in ABSA. We present a dataset of over 2,900 posts and sentences, comprising over 24,000 entities annotated for DSR over nine psychosocial dimensions by three annotators. We present a novel transformer-based ME-ABSA model for DSR, achieving favorable preliminary results on this dataset.
Citations: 1
Improving Generalization of Hate Speech Detection Systems to Novel Target Groups via Domain Adaptation
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.4
F. Ludwig, Klara Dolos, T. Zesch, E. Hobley
Despite recent advances in machine-learning-based hate speech detection, classifiers still struggle with generalizing knowledge to out-of-domain data samples. In this paper, we investigate the generalization capabilities of deep learning models to different target groups of hate speech under clean experimental settings. Furthermore, we assess the efficacy of three different strategies of unsupervised domain adaptation to improve these capabilities. Given the diversity of hate and its rapid dynamics in the online world (e.g. the evolution of new target groups like virologists during the COVID-19 pandemic), robustly detecting hate aimed at newly identified target groups is a highly relevant research question. We show that naively trained models suffer from a target-group-specific bias, which can be reduced via domain adaptation. By utilizing domain adaptation, we were able to achieve a relative improvement of the F1-score between 5.8% and 10.7% for out-of-domain target groups of hate speech compared to baseline approaches.
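The reported gain is a relative improvement, which is easy to confuse with an absolute difference in F1 points. The short sketch below shows the arithmetic; the baseline and adapted scores in the example are invented for illustration and do not come from the paper, and the function name is hypothetical.

```python
def relative_improvement(baseline_f1, adapted_f1):
    """Return (adapted - baseline) / baseline, i.e. the gain as a fraction of the baseline."""
    if baseline_f1 <= 0:
        raise ValueError("baseline F1 must be positive")
    return (adapted_f1 - baseline_f1) / baseline_f1


if __name__ == "__main__":
    # Illustrative numbers only: an absolute gain of 0.06 F1 on a baseline of 0.60
    # is a relative improvement of about 10%.
    print(f"{relative_improvement(0.60, 0.66):.1%}")
```

The same absolute gain yields a larger relative improvement when the baseline is low, so relative figures are best read alongside the underlying scores.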
Citations: 8
Resources for Multilingual Hate Speech Detection
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.12
Ayme Arango Monnar, Jorge Perez, Bárbara Poblete, M. Saldaña, Valentina Proust
Citations: 0
Separating Hate Speech and Offensive Language Classes via Adversarial Debiasing
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.1
Shuzhou Yuan, Antonis Maronikolakis, Hinrich Schütze
Research to tackle hate speech plaguing online media has made strides in providing solutions, analyzing bias and curating data. A challenging problem is ambiguity between hate speech and offensive language, causing low performance both overall and specifically for the hate speech class. It can be argued that misclassifying actual hate speech content as merely offensive can lead to further harm against targeted groups. In our work, we mitigate this potentially harmful phenomenon by proposing an adversarial debiasing method to separate the two classes. We show that our method works for English, Arabic, German and Hindi, as well as in a multilingual setting, improving performance over baselines.
Citations: 4
Counter-TWIT: An Italian Corpus for Online Counterspeech in Ecological Contexts
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.woah-1.6
Pierpaolo Goffredo, Valerio Basile, B. Cepollaro, V. Patti
This work describes the process of creating a corpus of Twitter conversations annotated for the presence of counterspeech in response to toxic speech related to axes of discrimination linked to sexism, racism and homophobia. The main novelty is an annotated dataset comprising relevant tweets in their context of occurrence. The corpus is made up of tweets and responses captured by different profiles replying to discriminatory content or objectionably couched news. An annotation scheme was created to make explicit the knowledge on the dimensions of toxic speech and counterspeech. An analysis of the collected and annotated data and of the IAA that emerged during the annotation process is included. Moreover, we report on preliminary experiments on automatic counterspeech detection, based on supervised automatic learning models trained on the new dataset. The results highlight the fundamental role played by the context in this detection task, confirming our intuitions about the importance of collecting tweets in their context of occurrence.
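Inter-annotator agreement (IAA) is usually reported as a chance-corrected coefficient. The abstract does not say which coefficient was used, so the sketch below assumes Cohen's kappa for two annotators purely as an example; the label names `cs` (counterspeech) and `no` are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled at random
    # according to their own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["cs", "cs", "no", "no"]
    annotator_2 = ["cs", "no", "no", "no"]
    print(cohens_kappa(annotator_1, annotator_2))
```

With more than two annotators, a coefficient such as Fleiss' kappa or Krippendorff's alpha would be the usual choice instead.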
Citations: 3