An insight into racial bias in dermoscopy repositories: A HAM10000 data set analysis

Andres Morales-Forero, Lili Rueda Jaime, Sebastian Ramiro Gil-Quiñones, Marlon Y. Barrera Montañez, Samuel Bassetto, Eric Coatanea
JEADV Clinical Practice, volume 3, issue 3, pages 836-843. Published 2024-06-14. DOI: 10.1002/jvc2.477. https://onlinelibrary.wiley.com/doi/10.1002/jvc2.477
Citation count: 0

Abstract

Background

Studies have revealed that patients with skin of colour are underrepresented in academic sources on dermatologic diseases, including databases. This visual racism has left specialists less comfortable and less confident in caring for these patients, and it has reduced the patients' chances of being correctly diagnosed.

Objectives

To investigate and uncover potential racial biases in the HAM10000 data set through an exploratory analysis of how dark skin tones are represented, the identification of inaccuracies in its documentation, the recognition of relevant skin conditions of darker skin that are absent from it, and the lack of ethnic diversity variables, which are crucial for validating diagnoses across different skin tones.

Methods

An exploratory examination was conducted to investigate the occurrence of dark skin within the HAM10000 database (housed in a Harvard Dataverse repository), which consists of 10,015 dermoscopic images of skin lesions. A visual depiction of the full range of skin tones was generated by sampling four crucial data points from each image and applying the Gray World algorithm for colour normalization. To confirm the accuracy of the graphical representation, dermatologists validated the pixel sampling process by analysing a randomly selected 10% of the images for each type of skin lesion. This visual representation was produced for the entire data set as well as for each skin lesion type. The study was extended by comparing the skin lesion representation within the HAM10000 data set against documented prevalences of relevant conditions affecting dark skin.
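
The colour normalization and pixel sampling steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say where the four sample points lie, so the near-corner placement, the `margin` parameter, and the function names `gray_world` and `sample_skin_tone` are assumptions made for illustration. Images are modelled as plain nested lists of RGB values so that the example needs no third-party packages.

```python
from statistics import fmean

Pixel = tuple[float, float, float]
Image = list[list[Pixel]]  # rows of (R, G, B) values in the range 0..255


def gray_world(image: Image) -> Image:
    """Gray World normalization: scale each channel so that its mean
    equals the mean intensity taken over all three channels."""
    pixels = [px for row in image for px in row]
    channel_means = [fmean(px[c] for px in pixels) for c in range(3)]
    gray = fmean(channel_means)
    # Leave an all-zero channel unchanged to avoid dividing by zero.
    gains = [gray / m if m > 0 else 1.0 for m in channel_means]
    return [
        [tuple(min(255.0, px[c] * gains[c]) for c in range(3)) for px in row]
        for row in image
    ]


def sample_skin_tone(image: Image, margin: int = 1) -> Pixel:
    """Average four sample points taken near the image corners.

    ASSUMPTION: the corner placement is hypothetical; the source only
    states that four data points were sampled from each image.
    """
    height, width = len(image), len(image[0])
    points = [
        (margin, margin),
        (margin, width - 1 - margin),
        (height - 1 - margin, margin),
        (height - 1 - margin, width - 1 - margin),
    ]
    samples = [image[r][c] for r, c in points]
    return tuple(fmean(s[c] for s in samples) for c in range(3))


if __name__ == "__main__":
    # A uniform image with a red cast, standing in for a dermoscopic photo.
    reddish = [[(200.0, 100.0, 100.0)] * 4 for _ in range(4)]
    print(sample_skin_tone(gray_world(reddish)))
```

Sampling away from the lesion centre is one plausible way to reach surrounding skin, but as the Results note, any fixed sampling scheme can pick up shadows or dark spots instead of the actual skin tone.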

Results

Fewer than 5% of the images came from dark-skinned patients. However, in about 4.9% of cases, our pixel sampling method may have captured shadows or dark spots produced by the imaging device or the lesion itself rather than the individual's actual skin tone. In addition, the data set's claims of diversity and comprehensive coverage are inaccurate, notably in the underrepresentation of conditions prevalent in darker skin and the absence of ethnic diversity variables.

Conclusions

Visual racism is an issue that needs to be addressed in medical sources of information and education. Image databases and artificial intelligence models need to be supplied with data covering all skin types to guarantee equal access to opportunities. Furthermore, any instances where conditions affecting people of colour are underrepresented must be meticulously documented and reported so that these disparities can be highlighted and addressed effectively. This is particularly important in dermoscopy imaging, where an analysis of racial bias that relies solely on images has limits: the dermatoscope's lighting alters the patient's actual skin tone, which complicates the accurate assessment of racial bias.
