Data Representativeness in Accessibility Datasets: A Meta-Analysis

Rie Kamikubo, Lining Wang, Crystal Marte, Amnah Mahmood, Hernisa Kacorri
Venue: ASSETS. Annual ACM Conference on Assistive Technologies
Publication date: 2022-10-01
DOI: 10.1145/3517428.3544826 (https://doi.org/10.1145/3517428.3544826)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10024595/pdf/nihms-1869788.pdf
Citations: 8

Abstract

As data-driven systems are increasingly deployed at scale, ethical concerns have arisen around unfair and discriminatory outcomes for historically marginalized groups that are underrepresented in training data. In response, work on AI fairness and inclusion has called for datasets that are representative of various demographic groups. In this paper, we contribute an analysis of the representativeness of age, gender, and race & ethnicity in accessibility datasets (datasets sourced from people with disabilities and older adults), which can potentially play an important role in mitigating bias for inclusive AI-infused applications. We examine the current state of representation within these datasets by reviewing publicly available information on 190 datasets, which we call accessibility datasets. We find that accessibility datasets represent diverse ages, but have gender and race representation gaps. Additionally, we investigate how the sensitive and complex nature of demographic variables (e.g., gender, race & ethnicity) makes classification difficult and inconsistent, with the source of labeling often unknown. By reflecting on the current challenges and opportunities for representation of disabled data contributors, we hope our effort expands the space of possibility for greater inclusion of marginalized communities in AI-infused systems.
