Dissecting Racial Bias in an Algorithm that Guides Health Decisions for 70 Million People

Z. Obermeyer, S. Mullainathan
{"title":"Dissecting Racial Bias in an Algorithm that Guides Health Decisions for 70 Million People","authors":"Z. Obermeyer, S. Mullainathan","doi":"10.1145/3287560.3287593","DOIUrl":null,"url":null,"abstract":"A single algorithm drives an important health care decision for over 70 million people in the US. When health systems anticipate that a patient will have especially complex and intensive future health care needs, she is enrolled in a 'care management' program, which provides considerable additional resources: greater attention from trained providers and help with coordination of her care. To determine which patients will have complex future health care needs, and thus benefit from program enrollment, many systems rely on an algorithmically generated commercial risk score. In this paper, we exploit a rich dataset to study racial bias in a commercial algorithm that is deployed nationwide today in many of the US's most prominent Accountable Care Organizations (ACOs). We document significant racial bias in this widely used algorithm, using data on primary care patients at a large hospital. Blacks and whites with the same algorithmic risk scores have very different realized health. For example, the highest-risk black patients (those at the threshold where patients are auto-enrolled in the program), have significantly more chronic illnesses than white enrollees with the same risk score. We use detailed physiological data to show the pervasiveness of the bias: across a range of biomarkers, from HbA1c levels for diabetics to blood pressure control for hypertensives, we find significant racial health gaps conditional on risk score. This bias has significant material consequences for patients: it effectively means that white patients with the same health as black patients are far more likely be enrolled in the care management program, and benefit from its resources. If we simulated a world without this gap in predictions, blacks would be auto-enrolled into the program at more than double the current rate. An unusual aspect of our dataset is that we observe not just the risk scores but also the input data and objective function used to construct it. This provides a unique window into the mechanisms by which bias arises. The algorithm is given a data frame with (1) Yit (label), total medical expenditures ('costs') in year t; and (2) Xi,t--1 (features), fine-grained care utilization data in year t -- 1 (e.g., visits to cardiologists, number of x-rays, etc.). The algorithm's predicted risk of developing complex health needs is thus in fact predicted costs. And by this metric, one could easily call the algorithm unbiased: costs are very similar for black and white patients with the same risk scores. So far, this is inconsistent with algorithmic bias: conditional on risk score, predictions do not favor whites or blacks. The fundamental problem we uncover is that when thinking about 'health care needs,' hospitals and insurers focus on costs. They use an algorithm whose specific objective is cost prediction, and from this perspective, predictions are accurate and unbiased. Yet from the social perspective, actual health -- not just costs -- also matters. This is where the problem arises: costs are not the same as health. While costs are a reasonable proxy for health (the sick do cost more, on average), they are an imperfect one: factors other than health can drive cost -- for example, race. 
We find that blacks cost more than whites on average; but this gap can be decomposed into two countervailing effects. First, blacks bear a different and larger burden of disease, making them costlier. But this difference in illness is offset by a second factor: blacks cost less, holding constant their exact chronic conditions, a force that dramatically reduces the overall cost gap. Perversely, the fact that blacks cost less than whites conditional on health means an algorithm that predicts costs accurately across racial groups will necessarily also generate biased predictions on health. The root cause of this bias is not in the procedure for prediction, or the underlying data, but the algorithm's objective function itself. This bias is akin to, but distinct from, 'mis-measured labels': it arises here from the choice of labels, not their measurement, which is in turn a consequence of the differing objective functions of private actors in the health sector and society. From the private perspective, the variable they focus on -- cost -- is being appropriately optimized. But our results hint at how algorithms may amplify a fundamental problem in health care as a whole: externalities produced when health care providers focus too narrowly on financial motives, optimizing on costs to the detriment of health. In this sense, our results suggest that a pervasive problem in health care -- incentives that induce health systems to focus on dollars rather than health -- also has consequences for the way algorithms are built and monitored.","PeriodicalId":20573,"journal":{"name":"Proceedings of the Conference on Fairness, Accountability, and Transparency","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"97","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on Fairness, Accountability, and Transparency","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3287560.3287593","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 97

Abstract

A single algorithm drives an important health care decision for over 70 million people in the US. When health systems anticipate that a patient will have especially complex and intensive future health care needs, she is enrolled in a 'care management' program, which provides considerable additional resources: greater attention from trained providers and help with coordination of her care. To determine which patients will have complex future health care needs, and thus benefit from program enrollment, many systems rely on an algorithmically generated commercial risk score.

In this paper, we exploit a rich dataset to study racial bias in a commercial algorithm that is deployed nationwide today in many of the US's most prominent Accountable Care Organizations (ACOs). We document significant racial bias in this widely used algorithm, using data on primary care patients at a large hospital. Blacks and whites with the same algorithmic risk scores have very different realized health. For example, the highest-risk black patients (those at the threshold where patients are auto-enrolled in the program) have significantly more chronic illnesses than white enrollees with the same risk score. We use detailed physiological data to show the pervasiveness of the bias: across a range of biomarkers, from HbA1c levels for diabetics to blood pressure control for hypertensives, we find significant racial health gaps conditional on risk score. This bias has significant material consequences for patients: it effectively means that white patients with the same health as black patients are far more likely to be enrolled in the care management program, and to benefit from its resources. If we simulated a world without this gap in predictions, blacks would be auto-enrolled into the program at more than double the current rate.

An unusual aspect of our dataset is that we observe not just the risk scores but also the input data and objective function used to construct them. This provides a unique window into the mechanisms by which bias arises. The algorithm is given a data frame with (1) $Y_{i,t}$ (label), total medical expenditures ('costs') in year $t$; and (2) $X_{i,t-1}$ (features), fine-grained care utilization data in year $t-1$ (e.g., visits to cardiologists, number of x-rays, etc.). The algorithm's predicted risk of developing complex health needs is thus in fact predicted costs. And by this metric, one could easily call the algorithm unbiased: costs are very similar for black and white patients with the same risk scores. So far, this is inconsistent with algorithmic bias: conditional on risk score, predictions do not favor whites or blacks.

The fundamental problem we uncover is that when thinking about 'health care needs,' hospitals and insurers focus on costs. They use an algorithm whose specific objective is cost prediction, and from this perspective, predictions are accurate and unbiased. Yet from the social perspective, actual health -- not just costs -- also matters. This is where the problem arises: costs are not the same as health. While costs are a reasonable proxy for health (the sick do cost more, on average), they are an imperfect one: factors other than health can drive cost -- for example, race.
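This training setup is a standard supervised-learning pipeline, and the check that exposes the bias is a simple conditional comparison. The sketch below is illustrative only: the file name, the column names (cost_t, n_chronic_conditions, race, and a few year t-1 utilization features), and the generic gradient-boosting model are all assumptions standing in for the vendor's proprietary pipeline, which is not public in this form.

```python
# Minimal sketch of the setup described above; all names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

df = pd.read_csv("patients.csv")  # assumed: one row per patient i

# Features X_{i,t-1}: fine-grained utilization in year t-1 (illustrative names).
features = ["cardiology_visits_tm1", "n_xrays_tm1", "ed_visits_tm1"]
# Label Y_{i,t}: total medical expenditures ('costs') in year t.
model = GradientBoostingRegressor().fit(df[features], df["cost_t"])

# The commercial 'risk score' is, by construction, predicted cost.
df["risk_score"] = model.predict(df[features])
df["risk_pct"] = pd.qcut(df["risk_score"], 100, labels=False, duplicates="drop")

# Calibration on the label: at each score percentile, realized costs are
# similar across races, so the algorithm looks unbiased by its own metric ...
print(df.groupby(["risk_pct", "race"])["cost_t"].mean().unstack())
# ... but on realized health (chronic-condition counts, or biomarkers such as
# HbA1c for diabetics), black patients are sicker at the same score.
print(df.groupby(["risk_pct", "race"])["n_chronic_conditions"].mean().unstack())
```

The two groupby calls are the paper's central contrast in miniature: calibration holds on the cost label but fails on realized health.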
We find that blacks cost more than whites on average, but this gap can be decomposed into two countervailing effects. First, blacks bear a different and larger burden of disease, making them costlier. But this difference in illness is offset by a second factor: blacks cost less, holding their exact chronic conditions constant, a force that dramatically reduces the overall cost gap. Perversely, the fact that blacks cost less than whites conditional on health means an algorithm that predicts costs accurately across racial groups will necessarily also generate biased predictions on health.

The root cause of this bias is not the procedure for prediction, or the underlying data, but the algorithm's objective function itself. This bias is akin to, but distinct from, 'mis-measured labels': it arises here from the choice of labels, not their measurement, which is in turn a consequence of the differing objective functions of private actors in the health sector and society. From the private perspective, the variable they focus on -- cost -- is being appropriately optimized. But our results hint at how algorithms may amplify a fundamental problem in health care as a whole: externalities produced when health care providers focus too narrowly on financial motives, optimizing on costs to the detriment of health. In this sense, our results suggest that a pervasive problem in health care -- incentives that induce health systems to focus on dollars rather than health -- also has consequences for the way algorithms are built and monitored.
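The cost-gap decomposition and the enrollment counterfactual can be sketched in the same style. Below, a pair of regressions contrasts the raw black-white cost gap with the gap holding chronic conditions fixed, and a threshold comparison re-ranks patients by a health-based label to approximate the auto-enrollment counterfactual. The 97th-percentile cutoff, the column names, and the use of a raw chronic-condition count as the health measure are assumptions for illustration, not the paper's exact specification.

```python
# Hedged sketch of the decomposition and counterfactual; names are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("patients_scored.csv")  # assumed to carry risk_score from above
df["black"] = (df["race"] == "black").astype(int)

# (1) Decompose the average cost gap: raw vs. conditional on health.
raw = smf.ols("cost_t ~ black", data=df).fit()
conditional = smf.ols("cost_t ~ black + n_chronic_conditions", data=df).fit()
print("raw cost gap:        ", raw.params["black"])
print("conditional cost gap:", conditional.params["black"])

# (2) Counterfactual enrollment: who clears the auto-enrollment cutoff if we
# rank by a health-based label instead of predicted cost?
CUTOFF = 0.97  # assumed auto-enrollment percentile

def black_share_enrolled(score_col: str) -> float:
    """Fraction of patients above the cutoff who are black, for a given score."""
    threshold = df[score_col].quantile(CUTOFF)
    return df.loc[df[score_col] >= threshold, "black"].mean()

print("cost-based score:  ", black_share_enrolled("risk_score"))
print("health-based score:", black_share_enrolled("n_chronic_conditions"))
```

Under the abstract's findings, the conditional coefficient flips sign relative to the raw gap, and the health-based ranking would more than double the black share above the cutoff.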