Title: Outlier bias: AI classification of curb ramps, outliers, and context
Author: Shiloh Deitz
Journal: Big Data & Society, Vol. 9, Issue 1 (JCR Q1, Social Sciences, Interdisciplinary; Impact Factor 6.5)
DOI: https://doi.org/10.1177/20539517231203669
Publication date: 2023-07-01
Publication type: Journal Article (not open access)
Citations: 0
Abstract
Technologies in the smart city, such as autonomous vehicles and delivery robots, promise to increase the mobility and freedom of people with disabilities. These technologies have also failed to “see” or comprehend wheelchair riders, people walking with service animals, and people walking with bicycles—all outliers to machine learning models. Big data and algorithms have been amply critiqued for their biases—harmful and systematic errors—but the harms that arise from AI's inherent inability to handle nuance, context, and exception have been largely overlooked. In this paper, I run two machine learning models across nine cities in the United States to attempt to fill a gap in data about the location of curb ramps. I find that while curb ramp prediction models may achieve up to 88% accuracy, the rate of accuracy varied in context in ways both predictable and unpredictable. I look closely at cases of unpredictable error (outlier bias), by triangulating with aerial and street view imagery. The sampling of cases shows that while it may be possible to conjecture about patterns in these errors, there is nothing clearly systematic. While more data and bigger models might improve the accuracy somewhat, I propose that a bias toward outliers is something fundamental to machine learning models which gravitate to the mean and require unbiased and not missing data. I conclude by arguing that universal design or design for the outliers is imperative for justice in the smart city where algorithms and data are increasingly embedded as infrastructure.
About the journal
Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work principally in the social sciences, humanities, and computing and their intersections with the arts and natural sciences. The journal focuses on the implications of Big Data for societies and aims to connect debates about Big Data practices and their effects on various sectors such as academia, social life, industry, business, and government.
BD&S treats Big Data as an emerging field of practices that is not solely defined by, but also generative of, unique data qualities such as high volume, granularity, data linking, and mining. The journal attends to digital content generated both online and offline, encompassing social media, search engines, closed networks (e.g., commercial or government transactions), and open networks such as digital archives, open government, and crowdsourced data. Rather than providing a fixed definition of Big Data, BD&S encourages interdisciplinary inquiries, debates, and studies on various topics and themes related to Big Data practices.
BD&S seeks contributions that analyze Big Data practices, involve empirical engagements and experiments with innovative methods, and reflect on the consequences of these practices for the representation, realization, and governance of societies. As a digital-only journal, BD&S's platform can accommodate multimedia formats such as complex images, dynamic visualizations, videos, and audio content. The contents of the journal encompass peer-reviewed research articles, colloquia, bookcasts, think pieces, state-of-the-art methods, and work by early career researchers.