
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021): Latest Publications

Racist or Sexist Meme? Classifying Memes beyond Hateful
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.23
Haris Bin Zia, Ignacio Castro, Gareth Tyson
Memes are combinations of text and images that are often humorous in nature. However, that is not always the case, and certain combinations of text and images may depict hate; these are referred to as hateful memes. This work presents a multimodal pipeline that takes both visual and textual features of memes into account to (1) identify the protected category (e.g. race, sex) that has been attacked; and (2) detect the type of attack (e.g. contempt, slurs). Our pipeline uses state-of-the-art pre-trained visual and textual representations, followed by a simple logistic regression classifier. We apply our pipeline to the Hateful Memes Challenge dataset, augmented with newly created fine-grained labels for protected category and type of attack. Our best model achieves an AUROC of 0.96 for identifying the protected category and 0.97 for detecting the type of attack. We release our code at https://github.com/harisbinzia/HatefulMemes
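The pipeline the abstract describes (pretrained visual and textual encoders feeding a simple logistic regression classifier) can be sketched as below. This is a hedged illustration, not the authors' exact code: the choice of BERT-base and ResNet-50 as encoders and the concatenation-based fusion are assumptions.

```python
# Minimal sketch of a multimodal meme pipeline: pretrained text and image
# encoders, fused by concatenation, feeding scikit-learn LogisticRegression.
# Encoder choices and fusion are illustrative assumptions, not the paper's setup.
import numpy as np
import torch
from PIL import Image
from sklearn.linear_model import LogisticRegression  # used in the commented fit below
from torchvision import models, transforms
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_encoder = AutoModel.from_pretrained("bert-base-uncased").eval()
image_encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
image_encoder.fc = torch.nn.Identity()  # drop the classification head, keep 2048-d features

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def meme_features(image_path: str, text: str) -> np.ndarray:
    """Concatenate the pooled text embedding with global image features."""
    toks = tokenizer(text, return_tensors="pt", truncation=True)
    txt = text_encoder(**toks).last_hidden_state[:, 0]  # [CLS] embedding, 768-d
    img = image_encoder(preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0))
    return torch.cat([txt, img], dim=-1).squeeze(0).numpy()  # 2816-d fused vector

# With fused features X and protected-category labels y (placeholders):
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```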
{"title":"Racist or Sexist Meme? Classifying Memes beyond Hateful","authors":"Haris Bin Zia, Ignacio Castro, Gareth Tyson","doi":"10.18653/v1/2021.woah-1.23","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.23","url":null,"abstract":"Memes are the combinations of text and images that are often humorous in nature. But, that may not always be the case, and certain combinations of texts and images may depict hate, referred to as hateful memes. This work presents a multimodal pipeline that takes both visual and textual features from memes into account to (1) identify the protected category (e.g. race, sex etc.) that has been attacked; and (2) detect the type of attack (e.g. contempt, slurs etc.). Our pipeline uses state-of-the-art pre-trained visual and textual representations, followed by a simple logistic regression classifier. We employ our pipeline on the Hateful Memes Challenge dataset with additional newly created fine-grained labels for protected category and type of attack. Our best model achieves an AUROC of 0.96 for identifying the protected category, and 0.97 for detecting the type of attack. We release our code at https://github.com/harisbinzia/HatefulMemes","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"299 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114855837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
A Large-Scale English Multi-Label Twitter Dataset for Cyberbullying and Online Abuse Detection
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.16
S. Salawu, Joan A. Lumsden, Yulan He
In this paper, we introduce a new English Twitter-based dataset for cyberbullying and online abuse detection. Comprising 62,587 tweets, this dataset was sourced from Twitter using specific query terms designed to retrieve tweets with a high probability of containing various forms of bullying and offensive content, including insult, trolling, profanity, sarcasm, threat, porn and exclusion. We recruited a pool of 17 annotators to perform fine-grained annotation on the dataset, with each tweet annotated by three annotators. All our annotators are high-school educated and frequent users of social media. Inter-rater agreement for the dataset, as measured by Krippendorff's Alpha, is 0.67. Analysis performed on the dataset confirmed common cyberbullying themes reported by other studies and revealed interesting relationships between the classes. The dataset was used to train a number of transformer-based deep learning models, returning impressive results.
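The reported agreement figure can be reproduced in principle with the third-party krippendorff package; a minimal sketch follows, using a toy reliability matrix (the real one would have 17 annotator rows and 62,587 tweet columns, with missing ratings as NaN).

```python
# Sketch of computing Krippendorff's alpha, assuming the `krippendorff`
# package (pip install krippendorff). The matrix below is a toy example,
# not the paper's actual annotations.
import numpy as np
import krippendorff

# Rows = annotators, columns = tweets; np.nan marks tweets an annotator skipped.
reliability_data = np.array([
    [1,      0, 1, 1, np.nan, 0],
    [1,      0, 1, 0, 1,      0],
    [np.nan, 0, 1, 1, 1,      0],
])

alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.2f}")  # the paper reports 0.67 on the full dataset
```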
{"title":"A Large-Scale English Multi-Label Twitter Dataset for Cyberbullying and Online Abuse Detection","authors":"S. Salawu, Joan A. Lumsden, Yulan He","doi":"10.18653/v1/2021.woah-1.16","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.16","url":null,"abstract":"In this paper, we introduce a new English Twitter-based dataset for cyberbullying detection and online abuse. Comprising 62,587 tweets, this dataset was sourced from Twitter using specific query terms designed to retrieve tweets with high probabilities of various forms of bullying and offensive content, including insult, trolling, profanity, sarcasm, threat, porn and exclusion. We recruited a pool of 17 annotators to perform fine-grained annotation on the dataset with each tweet annotated by three annotators. All our annotators are high school educated and frequent users of social media. Inter-rater agreement for the dataset as measured by Krippendorff’s Alpha is 0.67. Analysis performed on the dataset confirmed common cyberbullying themes reported by other studies and revealed interesting relationships between the classes. The dataset was used to train a number of transformer-based deep learning models returning impressive results.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133473858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Targets and Aspects in Social Media Hate Speech
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.19
A. Shvets, Paula Fortuna, Juan Soler, L. Wanner
Mainstream research on hate speech has so far focused predominantly on classifying social media posts with respect to predefined typologies of rather coarse-grained hate speech categories. This may be sufficient if the goal is to detect and delete abusive language posts. However, removal is not always possible due to the legislation of a country. Also, there is evidence that hate speech cannot be successfully combated by merely removing hate speech posts; it should be countered by education and counter-narratives. For this purpose, we need to identify (i) who is the target in a given hate speech post, and (ii) what aspects (or characteristics) of the target are attributed to the target in the post. As a first approximation, we propose to adapt a generic state-of-the-art concept extraction model to the hate speech domain. The outcome of the experiments is promising and can serve as inspiration for further work on the task.
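As a loose stand-in for the concept-extraction step (the paper's actual model is not reproduced here), the sketch below uses spaCy to surface crude candidates for targets and attributed aspects.

```python
# Hedged illustration only: spaCy entities and noun chunks as naive candidates
# for "who is targeted" and "what characteristic is attributed to them".
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def candidate_targets_and_aspects(post: str):
    doc = nlp(post)
    # Group-like named entities are crude target candidates; noun chunks
    # are crude candidates for the attributed aspects.
    targets = [ent.text for ent in doc.ents if ent.label_ in {"NORP", "PERSON", "GPE"}]
    aspects = [chunk.text for chunk in doc.noun_chunks]
    return targets, aspects

print(candidate_targets_and_aspects("Immigrants are ruining this country"))
```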
{"title":"Targets and Aspects in Social Media Hate Speech","authors":"A. Shvets, Paula Fortuna, Juan Soler, L. Wanner","doi":"10.18653/v1/2021.woah-1.19","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.19","url":null,"abstract":"Mainstream research on hate speech focused so far predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse-grained hate speech categories. This may be sufficient if the goal is to detect and delete abusive language posts. However, removal is not always possible due to the legislation of a country. Also, there is evidence that hate speech cannot be successfully combated by merely removing hate speech posts; they should be countered by education and counter-narratives. For this purpose, we need to identify (i) who is the target in a given hate speech post, and (ii) what aspects (or characteristics) of the target are attributed to the target in the post. As the first approximation, we propose to adapt a generic state-of-the-art concept extraction model to the hate speech domain. The outcome of the experiments is promising and can serve as inspiration for further work on the task","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"741 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116120724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Jibes & Delights: A Dataset of Targeted Insults and Compliments to Tackle Online Abuse Jibes & Delights:有针对性的侮辱和赞美数据集,以解决在线滥用
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.14
Ravsimar Sodhi, Kartikey Pant, Radhika Mamidi
Online abuse and offensive language on social media have become widespread problems in today's digital age. In this paper, we first contribute a Reddit-based dataset, consisting of 68,159 insults and 51,102 compliments targeted at individuals rather than at a particular community or race. Second, we benchmark multiple existing state-of-the-art models for both classification and unsupervised style transfer on the dataset. Finally, we analyse the experimental results and conclude that the transfer task is challenging, requiring the models to understand the high degree of creativity exhibited in the data.
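For orientation, a hedged baseline for the insult-vs-compliment classification side of the benchmark might look as follows; this is a simple TF-IDF linear model, not one of the paper's transformer systems.

```python
# Illustrative baseline only: TF-IDF features with logistic regression.
# The two toy comments stand in for the real 68,159 insults / 51,102 compliments.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["you are a genius", "you are an absolute clown"]
labels = ["compliment", "insult"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["what a brilliant idea"]))
```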
{"title":"Jibes & Delights: A Dataset of Targeted Insults and Compliments to Tackle Online Abuse","authors":"Ravsimar Sodhi, Kartikey Pant, Radhika Mamidi","doi":"10.18653/v1/2021.woah-1.14","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.14","url":null,"abstract":"Online abuse and offensive language on social media have become widespread problems in today’s digital age. In this paper, we contribute a Reddit-based dataset, consisting of 68,159 insults and 51,102 compliments targeted at individuals instead of targeting a particular community or race. Secondly, we benchmark multiple existing state-of-the-art models for both classification and unsupervised style transfer on the dataset. Finally, we analyse the experimental results and conclude that the transfer task is challenging, requiring the models to understand the high degree of creativity exhibited in the data.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129476383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Multimodal or Text? Retrieval or BERT? Benchmarking Classifiers for the Shared Task on Hateful Memes
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.24
Vasiliki Kougia, John Pavlopoulos
The Shared Task on Hateful Memes is a challenge that aims at the detection of hateful content in memes by inviting the implementation of systems that understand memes, potentially by combining image and textual information. The challenge consists of three detection tasks: hate, protected category and attack type. The first is a binary classification task, while the other two are multi-label classification tasks. Our participation included a text-based BERT baseline (TxtBERT), the same but adding information from the image (ImgBERT), and neural retrieval approaches. We also experimented with retrieval augmented classification models. We found that an ensemble of TxtBERT and ImgBERT achieves the best performance in terms of ROC AUC score in two out of the three tasks on our development set.
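The ensembling the abstract describes can be illustrated with a short sketch: average the hatefulness probabilities of the two models and score with ROC AUC. The probability arrays below are made up for illustration.

```python
# Sketch of probability-averaging ensemble of a text-only model (TxtBERT)
# and a text+image model (ImgBERT), scored with ROC AUC. Values are toy data.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 1, 1, 0, 1])               # gold hatefulness labels
p_txtbert = np.array([0.2, 0.7, 0.6, 0.4, 0.9])  # TxtBERT P(hateful)
p_imgbert = np.array([0.3, 0.8, 0.5, 0.2, 0.8])  # ImgBERT P(hateful)

p_ensemble = (p_txtbert + p_imgbert) / 2         # simple probability averaging
print("ROC AUC:", roc_auc_score(y_true, p_ensemble))
```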
{"title":"Multimodal or Text? Retrieval or BERT? Benchmarking Classifiers for the Shared Task on Hateful Memes","authors":"Vasiliki Kougia, John Pavlopoulos","doi":"10.18653/v1/2021.woah-1.24","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.24","url":null,"abstract":"The Shared Task on Hateful Memes is a challenge that aims at the detection of hateful content in memes by inviting the implementation of systems that understand memes, potentially by combining image and textual information. The challenge consists of three detection tasks: hate, protected category and attack type. The first is a binary classification task, while the other two are multi-label classification tasks. Our participation included a text-based BERT baseline (TxtBERT), the same but adding information from the image (ImgBERT), and neural retrieval approaches. We also experimented with retrieval augmented classification models. We found that an ensemble of TxtBERT and ImgBERT achieves the best performance in terms of ROC AUC score in two out of the three tasks on our development set.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"51 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114116853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Exploiting Auxiliary Data for Offensive Language Detection with Bidirectional Transformers
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.1
Sumer Singh, Sheng Li
Offensive language detection (OLD) has received increasing attention due to its societal impact. Recent work shows that bidirectional transformer-based methods obtain impressive performance on OLD. However, such methods usually rely on large-scale, well-labeled OLD datasets for model training. To address the issue of data/label scarcity in OLD, in this paper we propose a simple yet effective domain adaptation approach to train bidirectional transformers. Our approach introduces domain adaptation (DA) training procedures to ALBERT, such that it can effectively exploit auxiliary data from source domains to improve OLD performance in a target domain. Experimental results on benchmark datasets show that our approach, ALBERT (DA), obtains state-of-the-art performance in most cases. In particular, our approach substantially benefits underrepresented and under-performing classes, with a significant improvement over ALBERT.
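The paper's exact DA procedure is not spelled out in the abstract; the sketch below shows the simplest sequential variant (fine-tune ALBERT on auxiliary source-domain data, then continue on the target OLD data) as a hedged approximation.

```python
# Hedged sketch of sequential source-then-target fine-tuning with ALBERT,
# not necessarily the paper's DA procedure. Dataset variables are placeholders.
from transformers import (AlbertForSequenceClassification, AlbertTokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")  # for preprocessing
model = AlbertForSequenceClassification.from_pretrained("albert-base-v2", num_labels=2)

def finetune(model, dataset, output_dir):
    """One fine-tuning pass over a tokenized, labeled dataset."""
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                             per_device_train_batch_size=16)
    Trainer(model=model, args=args, train_dataset=dataset).train()
    return model

# source_dataset / target_dataset: tokenized, labeled datasets (placeholders).
# model = finetune(model, source_dataset, "out/source")  # auxiliary source domain
# model = finetune(model, target_dataset, "out/target")  # target OLD domain
```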
{"title":"Exploiting Auxiliary Data for Offensive Language Detection with Bidirectional Transformers","authors":"Sumer Singh, Sheng Li","doi":"10.18653/v1/2021.woah-1.1","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.1","url":null,"abstract":"Offensive language detection (OLD) has received increasing attention due to its societal impact. Recent work shows that bidirectional transformer based methods obtain impressive performance on OLD. However, such methods usually rely on large-scale well-labeled OLD datasets for model training. To address the issue of data/label scarcity in OLD, in this paper, we propose a simple yet effective domain adaptation approach to train bidirectional transformers. Our approach introduces domain adaptation (DA) training procedures to ALBERT, such that it can effectively exploit auxiliary data from source domains to improve the OLD performance in a target domain. Experimental results on benchmark datasets show that our approach, ALBERT (DA), obtains the state-of-the-art performance in most cases. Particularly, our approach significantly benefits underrepresented and under-performing classes, with a significant improvement over ALBERT.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128833485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Toxic Comment Collection: Making More Than 30 Datasets Easily Accessible in One Unified Format
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.17
Julian Risch, Philipp Schmidt, Ralf Krestel
With the rise of research on toxic comment classification, more and more annotated datasets have been released. The wide variety of the task (different languages, different labeling processes and schemes) has led to a large number of heterogeneous datasets that can be used for training and testing very specific settings. Despite recent efforts to create web pages that provide an overview, most publications still use only a single dataset. The datasets are not stored in one central database, they come in many different data formats, and it is difficult to interpret their class labels and how to reuse those labels in other projects. To overcome these issues, we present a collection of more than thirty datasets in the form of a software tool that automates downloading and processing of the data and presents it in a unified data format that also offers a mapping of compatible class labels. Another advantage of the tool is that it gives an overview of the properties of the available datasets, such as languages, platforms, and class labels, to make it easier to select suitable training and test data.
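The core unification idea (mapping each dataset's native labels onto one compatible scheme) can be sketched as below; the field names and mappings are illustrative assumptions, not the tool's actual schema.

```python
# Hedged sketch of label unification across heterogeneous toxicity datasets.
# Dataset names, native labels, and the shared scheme are illustrative only.
LABEL_MAPS = {
    "jigsaw": {"toxic": "toxic", "severe_toxic": "toxic", "clean": "not_toxic"},
    "founta": {"abusive": "toxic", "hateful": "toxic", "normal": "not_toxic"},
}

def unify(record: dict, source: str) -> dict:
    """Convert one dataset-specific record into the shared format."""
    mapped = LABEL_MAPS[source].get(record["label"])
    if mapped is None:
        raise ValueError(f"{source}: no compatible unified label for {record['label']!r}")
    return {"text": record["text"], "label": mapped, "source": source}

print(unify({"text": "you idiot", "label": "abusive"}, source="founta"))
# -> {'text': 'you idiot', 'label': 'toxic', 'source': 'founta'}
```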
{"title":"Toxic Comment Collection: Making More Than 30 Datasets Easily Accessible in One Unified Format","authors":"Julian Risch, Philipp Schmidt, Ralf Krestel","doi":"10.18653/v1/2021.woah-1.17","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.17","url":null,"abstract":"With the rise of research on toxic comment classification, more and more annotated datasets have been released. The wide variety of the task (different languages, different labeling processes and schemes) has led to a large amount of heterogeneous datasets that can be used for training and testing very specific settings. Despite recent efforts to create web pages that provide an overview, most publications still use only a single dataset. They are not stored in one central database, they come in many different data formats and it is difficult to interpret their class labels and how to reuse these labels in other projects. To overcome these issues, we present a collection of more than thirty datasets in the form of a software tool that automatizes downloading and processing of the data and presents them in a unified data format that also offers a mapping of compatible class labels. Another advantage of that tool is that it gives an overview of properties of available datasets, such as different languages, platforms, and class labels to make it easier to select suitable training and test data.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131195627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Findings of the WOAH 5 Shared Task on Fine Grained Hateful Memes Detection
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.21
Lambert Mathias, Shaoliang Nie, Aida Mostafazadeh Davani, Douwe Kiela, Vinodkumar Prabhakaran, Bertie Vidgen, Zeerak Talat
We present the results and main findings of the shared task at WOAH 5 on hateful memes detection. The task includes two subtasks relating to distinct challenges in the fine-grained detection of hateful memes: (1) the protected category attacked by the meme and (2) the attack type. Three teams submitted system description papers. This shared task builds on the hateful memes detection task created by Facebook AI Research in 2020.
Citations: 12
Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.9
Marta Marchiori Manerba, Sara Tonelli
Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems were integrated into real-world applications. In this paper, we create ad hoc tests through the CheckList tool (Ribeiro et al., 2020) to detect biases within abusive language classifiers for English. We compare the behaviour of two BERT-based models, one trained on a generic hate speech dataset and the other on a dataset for misogyny detection. Our evaluation shows that, although BERT-based classifiers achieve high accuracy levels on a variety of natural language processing tasks, they perform very poorly with regard to fairness and bias, in particular on samples involving implicit stereotypes, expressions of hate towards minorities, and protected attributes such as race or sexual orientation. We release both the notebooks implemented to extend the fairness tests and the synthetic datasets usable to evaluate system bias independently of CheckList.
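A CheckList-style fairness probe in the spirit of the paper can be sketched with the library's template editor; the template, the lexicon of protected-group terms, and the classifier function are illustrative assumptions.

```python
# Hedged sketch of a template-based bias probe using CheckList's Editor
# (pip install checklist). The idea: the same template filled with different
# protected-group terms should receive consistent predictions.
from checklist.editor import Editor

editor = Editor()
samples = editor.template("{group} are terrible drivers.",
                          group=["women", "men", "immigrants", "Italians"])

# `classifier` is a hypothetical text-classification function (not provided here):
# for text in samples.data:
#     print(text, classifier(text))  # flag groups where predictions diverge
```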
{"title":"Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList","authors":"Marta Marchiori Manerba, Sara Tonelli","doi":"10.18653/v1/2021.woah-1.9","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.9","url":null,"abstract":"Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems were integrated in real-world applications. In this paper, we create ad hoc tests through the CheckList tool (Ribeiro et al., 2020) to detect biases within abusive language classifiers for English. We compare the behaviour of two BERT-based models, one trained on a generic hate speech dataset and the other on a dataset for misogyny detection. Our evaluation shows that, although BERT-based classifiers achieve high accuracy levels on a variety of natural language processing tasks, they perform very poorly as regards fairness and bias, in particular on samples involving implicit stereotypes, expressions of hate towards minorities and protected attributes such as race or sexual orientation. We release both the notebooks implemented to extend the Fairness tests and the synthetic datasets usable to evaluate systems bias independently of CheckList.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"64 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116557884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Abusive Language on Social Media Through the Legal Looking Glass
Pub Date: 2021-08-01 DOI: 10.18653/v1/2021.woah-1.20
Thales Bertaglia, A. Grigoriu, M. Dumontier, Gijs van Dijck
Abusive language is a growing phenomenon on social media platforms. Its effects can reach beyond the online context, contributing to mental or emotional stress on users. Automatic tools for detecting abuse can alleviate the issue. In practice, developing automated methods to detect abusive language relies on good-quality data. However, there is currently a lack of standards for creating datasets in the field. Such standards include definitions of what is considered abusive language, annotation guidelines, and reporting on the process. This paper introduces an annotation framework inspired by legal concepts to define abusive language in the context of online harassment. The framework uses a 7-point Likert scale for labelling instead of class labels. We also present ALYT, a dataset of abusive language on YouTube. ALYT includes YouTube comments in English, extracted from videos on different controversial topics and labelled by law students. The comments were sampled from the actual collected data, without artificial methods for increasing the abusive content. The paper describes the annotation process thoroughly, including all its guidelines and training steps.
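One plausible way to aggregate the 7-point Likert annotations into a single judgement is sketched below; the midpoint threshold of 4 is an illustrative assumption, not the paper's rule.

```python
# Hedged sketch: average several annotators' 1-7 Likert ratings for a comment
# and binarize around the scale midpoint. Threshold choice is illustrative.
from statistics import mean

def aggregate_likert(scores: list[int], threshold: float = 4.0) -> tuple[float, str]:
    """Average the Likert ratings and binarize around the midpoint."""
    avg = mean(scores)
    return avg, ("abusive" if avg > threshold else "not_abusive")

print(aggregate_likert([6, 5, 7]))  # -> (6.0, 'abusive')
```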
{"title":"Abusive Language on Social Media Through the Legal Looking Glass","authors":"Thales Bertaglia, A. Grigoriu, M. Dumontier, Gijs van Dijck","doi":"10.18653/v1/2021.woah-1.20","DOIUrl":"https://doi.org/10.18653/v1/2021.woah-1.20","url":null,"abstract":"Abusive language is a growing phenomenon on social media platforms. Its effects can reach beyond the online context, contributing to mental or emotional stress on users. Automatic tools for detecting abuse can alleviate the issue. In practice, developing automated methods to detect abusive language relies on good quality data. However, there is currently a lack of standards for creating datasets in the field. These standards include definitions of what is considered abusive language, annotation guidelines and reporting on the process. This paper introduces an annotation framework inspired by legal concepts to define abusive language in the context of online harassment. The framework uses a 7-point Likert scale for labelling instead of class labels. We also present ALYT – a dataset of Abusive Language on YouTube. ALYT includes YouTube comments in English extracted from videos on different controversial topics and labelled by Law students. The comments were sampled from the actual collected data, without artificial methods for increasing the abusive content. The paper describes the annotation process thoroughly, including all its guidelines and training steps.","PeriodicalId":166161,"journal":{"name":"Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115033391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4