Not Judging a User by Their Cover: Understanding Harm in Multi-Modal Processing within Social Media Research
Authors: Jiachen Jiang, Soroush Vosoughi
Venue: Proceedings of the 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in Multimedia
Published: 2020-10-12
DOI: 10.1145/3422841.3423534 (https://doi.org/10.1145/3422841.3423534)
Citations: 6
Abstract
Social media has shaken the foundations of our society, unlikely as it may seem. Many of the popular tools used to moderate harmful digital content, however, have received widespread criticism from both the academic community and the public sphere for middling performance and lack of accountability. Though social media research is thought to center primarily on natural language processing, we demonstrate the need for the community to understand multimedia processing and its unique ethical considerations. Specifically, we identify statistical differences in the performance of Amazon Mechanical Turk (MTurk) annotators when different modalities of information are provided and discuss the patterns of harm that arise from crowd-sourced human demographic prediction. Finally, we discuss the consequences of those biases by auditing the performance of a toxicity detector called Perspective API on the language of Twitter users across a variety of demographic categories.