Overview of the HASOC Track at FIRE 2020: Hate Speech and Offensive Language Identification in Tamil, Malayalam, Hindi, English and German

Thomas Mandl, Sandip J Modha, M. Anandkumar, Bharathi Raja Chakravarthi
Proceedings of the 12th Annual Meeting of the Forum for Information Retrieval Evaluation. Published 2020-12-16. DOI: 10.1145/3441501.3441517 (https://doi.org/10.1145/3441501.3441517)
Citation count: 167

Abstract

This paper presents the HASOC track and its two parts. HASOC is dedicated to evaluating technology for finding Offensive Language and Hate Speech. HASOC is creating test collections for languages with few resources, with English included for comparison. The first track within HASOC continued the work from 2019 and provided a testbed of Twitter posts for Hindi, German and English. The second track within HASOC created test resources for Tamil and Malayalam in native and Latin script. Posts were extracted mainly from YouTube and Twitter. Both tracks attracted much interest: more than 40 research groups participated and described their approaches in papers. In this overview, we present the tasks, the data and the main results.