Racial/ethnic reporting differences in cancer literature regarding machine learning vs. a radiologist: a systematic review and meta-analysis

Rahil Patel, Destie Provenzano, Sherrie Flynt Wallington, Murray Loew, Yuan James Rao, Sharad Goyal
Journal of Medical Artificial Intelligence. Published 2023-11-01. DOI: 10.21037/jmai-23-31 (https://doi.org/10.21037/jmai-23-31)

Abstract

Background: Machine learning (ML) has emerged as a promising tool to assist physicians in diagnosing and classifying patient conditions from medical imaging data. However, as clinical applications of ML become more common, there is concern that improperly trained algorithms may carry ethnoracial biases. It has long been known that cancer outcomes vary among racial/ethnic groups.

Methods: We reviewed 84 studies that compared the results of ML algorithms with those of radiologists for cancer prediction, to evaluate whether such algorithms account for potential ethnoracial biases in their training samples. Articles were retrieved from PubMed, MEDLINE, and Google Scholar, and all studies published before May 2022 were extracted. Two researchers independently reviewed 115 articles and evaluated whether each reported demographic information and incorporated it into the algorithm. Studies were excluded if they used an inappropriate imaging type, did not report benign vs. malignant cancer results, did not compare the algorithm to a board-certified radiologist, or were not in English.

Results: Of the 84 studies included, 87% (n=73) reported demographic information and 38% (n=32) evaluated the effect of demographic information on model performance. However, only about 11% (n=9) of the articles reported racial/ethnic groups, and about 4% (n=3) incorporated racial/ethnic information into their models. Of the nine studies that reported racial/ethnic information, the groups included most often were White/Caucasian (n=9/9) and Black/African American (n=8/9). Asian (n=4/9), American Indian (n=3/9), and Hispanic (n=2/9) were each reported in fewer than half of the studies.

Conclusions: The lack of inclusion of not only racial/ethnic information but also other demographic information, such as age, gender, body mass index (BMI), or patient history, is indicative of a larger problem within artificial intelligence (AI) for cancer imaging. It is crucial to report and consider demographics not only in AI for cancer but also in the overall care of a cancer patient. The findings of this study highlight the need for ML studies to consider and evaluate demographic information when selecting a patient population for training the algorithm.
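The percentages in the Results can be re-derived from the reported counts. The short Python sketch below is only an arithmetic check, not part of the authors' analysis: the variable and function names are illustrative, and the number of excluded articles assumes that the 84 included studies were drawn from the 115 articles reviewed, which the abstract implies but does not state.

```python
# Counts as reported in the abstract; all names here are illustrative.
INCLUDED = 84    # studies included in the review
SCREENED = 115   # articles independently reviewed by two researchers

reported_counts = {
    "reported any demographic information": 73,
    "evaluated demographics against model performance": 32,
    "reported racial/ethnic groups": 9,
    "incorporated race/ethnicity into the model": 3,
}


def share(count, total):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


shares = {label: share(n, INCLUDED) for label, n in reported_counts.items()}

# Assumption: the included studies are a subset of the screened articles.
excluded = SCREENED - INCLUDED

for label, pct in shares.items():
    print(f"{label}: {reported_counts[label]}/{INCLUDED} = {pct}%")
print(f"excluded after screening (assumed): {excluded}")
```

The rounded shares agree with the 87%, 38%, 11%, and 4% quoted in the Results.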