BelElect: A New Dataset for Bias Research from a "Dark" Platform
Sviatlana Höhn, S. Mauw, Nicholas M. Asher
International Conference on Web and Social Media, published 2022-05-31
DOI: https://doi.org/10.1609/icwsm.v16i1.19378
Citations: 1
Abstract
New social networks and platforms such as Telegram, Gab and Parler offer a stage for extremist, racist and aggressive content, but also provide a safe space for freedom fighters in authoritarian regimes. Data from such platforms offer excellent opportunities for research on issues such as linguistic bias and toxic language detection. However, only a few, mostly unannotated, English-only corpora from such platforms exist. This article presents a new Telegram corpus in Russian and Belorussian languages tailored for research on linguistic bias in political news. In addition, we created a repository to make all currently available corpora from so-called "dark" platforms accessible in one place.