“Too Soon” to count? How gender and race cloud notability considerations on Wikipedia

IF 6.5 | Zone 1 (Sociology) | Q1, Social Sciences, Interdisciplinary | Big Data & Society | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231165490
Mackenzie Lemieux, Rebecca Zhang, F. Tripodi
Citations: 3

Abstract

While research has explored the extent of gender bias and the barriers to women's inclusion on English-language Wikipedia, very little research has focused on the problem of racial bias within the encyclopedia. Despite advocacy groups' efforts to incrementally improve representation on Wikipedia, much is unknown regarding how biographies are assessed after creation. Applying a combination of web-scraping, deep learning, natural language processing, and qualitative analysis to pages of academics nominated for deletion on Wikipedia, we demonstrate how Wikipedia's notability guidelines are unequally applied across race and gender. We find that online presence predicts whether a Wikipedia page is kept or deleted for white male academics but that this metric is idiosyncratically applied for female and BIPOC academics. Further, women's pages, regardless of race, were more likely to be deemed “too soon” for Wikipedia. A deeper analysis of the deletion archives reveals that when the tag is used on a woman's biography it is done so outside of the community guidelines, referring to one's career stage rather than media/online coverage. We argue that awareness of hidden biases on Wikipedia is critical to the objective and equitable application of the notability criteria across race and gender both on the encyclopedia and beyond.
Source journal
Big Data & Society (Social Sciences, Interdisciplinary)
CiteScore: 10.90
Self-citation rate: 10.60%
Articles published: 59
Review time: 11 weeks
Journal description: Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work principally in the social sciences, humanities, and computing and their intersections with the arts and natural sciences. The journal focuses on the implications of Big Data for societies and aims to connect debates about Big Data practices and their effects on various sectors such as academia, social life, industry, business, and government. BD&S considers Big Data as an emerging field of practices, not solely defined by but generative of unique data qualities such as high volume, granularity, data linking, and mining. The journal pays attention to digital content generated both online and offline, encompassing social media, search engines, closed networks (e.g., commercial or government transactions), and open networks like digital archives, open government, and crowdsourced data. Rather than providing a fixed definition of Big Data, BD&S encourages interdisciplinary inquiries, debates, and studies on various topics and themes related to Big Data practices. BD&S seeks contributions that analyze Big Data practices, involve empirical engagements and experiments with innovative methods, and reflect on the consequences of these practices for the representation, realization, and governance of societies. As a digital-only journal, BD&S's platform can accommodate multimedia formats such as complex images, dynamic visualizations, videos, and audio content. The contents of the journal encompass peer-reviewed research articles, colloquia, bookcasts, think pieces, state-of-the-art methods, and work by early career researchers.