{"title":"在医学领域的人工智能和机器学习研究中,性别、民族和种族数据往往未被报道","authors":"Mahmoud Elmahdy, Ronnie Sebro","doi":"10.1016/j.ibmed.2023.100113","DOIUrl":null,"url":null,"abstract":"<div><p>The use of artificial intelligence (AI) programs in healthcare and medicine has steadily increased over the past decade. One major challenge affecting the use of AI programs is that the results of AI programs are sometimes not replicable, meaning that the performance of the AI program is substantially different in the external testing dataset when compared to its performance in the training or validation datasets. This often happens when the external testing dataset is very different from the training or validation datasets. Sex, ethnicity, and race are some of the most important biological and social determinants of health, and are important factors that may differ between training, validation, and external testing datasets, and may contribute to the lack of reproducibility of AI programs. We reviewed over 28,000 original research articles published in the three journals with the highest impact factors in each of 16 medical specialties between 2019 and 2022, to evaluate how often the sex, ethnic, and racial compositions of the datasets used to develop AI algorithms were reported. We also reviewed all currently used AI reporting guidelines, to evaluate which guidelines recommend specific reporting of sex, ethnicity, and race. We find that only 42.47 % (338/797) of articles reported sex, 1.4 % (12/831) reported ethnicity, and 7.3 % (61/831) reported race. When sex was reported, approximately 55.8 % of the study participants were female, and when ethnicity was reported, only 6.2 % of the study participants were Hispanic/Latino. When race was reported, only 29.4 % of study participants were non-White. Most AI guidelines (93.3 %; 14/15) also did not recommend reporting sex, ethnicity, and race. To have fair and ethnical AI, it is important that the sex, ethnic, and racial compositions of the datasets used to develop the AI program are known.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"8 ","pages":"Article 100113"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521223000273/pdfft?md5=070012db350ebd8eac9219c40819eaa8&pid=1-s2.0-S2666521223000273-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Sex, ethnicity, and race data are often unreported in artificial intelligence and machine learning studies in medicine\",\"authors\":\"Mahmoud Elmahdy, Ronnie Sebro\",\"doi\":\"10.1016/j.ibmed.2023.100113\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The use of artificial intelligence (AI) programs in healthcare and medicine has steadily increased over the past decade. One major challenge affecting the use of AI programs is that the results of AI programs are sometimes not replicable, meaning that the performance of the AI program is substantially different in the external testing dataset when compared to its performance in the training or validation datasets. This often happens when the external testing dataset is very different from the training or validation datasets. 
Sex, ethnicity, and race are some of the most important biological and social determinants of health, and are important factors that may differ between training, validation, and external testing datasets, and may contribute to the lack of reproducibility of AI programs. We reviewed over 28,000 original research articles published in the three journals with the highest impact factors in each of 16 medical specialties between 2019 and 2022, to evaluate how often the sex, ethnic, and racial compositions of the datasets used to develop AI algorithms were reported. We also reviewed all currently used AI reporting guidelines, to evaluate which guidelines recommend specific reporting of sex, ethnicity, and race. We find that only 42.47 % (338/797) of articles reported sex, 1.4 % (12/831) reported ethnicity, and 7.3 % (61/831) reported race. When sex was reported, approximately 55.8 % of the study participants were female, and when ethnicity was reported, only 6.2 % of the study participants were Hispanic/Latino. When race was reported, only 29.4 % of study participants were non-White. Most AI guidelines (93.3 %; 14/15) also did not recommend reporting sex, ethnicity, and race. To have fair and ethnical AI, it is important that the sex, ethnic, and racial compositions of the datasets used to develop the AI program are known.</p></div>\",\"PeriodicalId\":73399,\"journal\":{\"name\":\"Intelligence-based medicine\",\"volume\":\"8 \",\"pages\":\"Article 100113\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2666521223000273/pdfft?md5=070012db350ebd8eac9219c40819eaa8&pid=1-s2.0-S2666521223000273-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligence-based medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666521223000273\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligence-based medicine","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666521223000273","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Sex, ethnicity, and race data are often unreported in artificial intelligence and machine learning studies in medicine
The use of artificial intelligence (AI) programs in healthcare and medicine has steadily increased over the past decade. One major challenge is that the results of AI programs are sometimes not replicable, meaning that an AI program's performance on an external testing dataset differs substantially from its performance on the training or validation datasets. This often happens when the external testing dataset is very different from the training or validation datasets. Sex, ethnicity, and race are among the most important biological and social determinants of health; they are factors that may differ between training, validation, and external testing datasets, and may contribute to the lack of reproducibility of AI programs. We reviewed over 28,000 original research articles published between 2019 and 2022 in the three journals with the highest impact factors in each of 16 medical specialties, to evaluate how often the sex, ethnic, and racial compositions of the datasets used to develop AI algorithms were reported. We also reviewed all currently used AI reporting guidelines, to evaluate which guidelines recommend specific reporting of sex, ethnicity, and race. We found that only 42.47 % (338/797) of articles reported sex, 1.4 % (12/831) reported ethnicity, and 7.3 % (61/831) reported race. When sex was reported, approximately 55.8 % of study participants were female; when ethnicity was reported, only 6.2 % were Hispanic/Latino; and when race was reported, only 29.4 % were non-White. Most AI reporting guidelines (93.3 %; 14/15) also did not recommend reporting sex, ethnicity, and race. For AI to be fair and ethical, it is important that the sex, ethnic, and racial compositions of the datasets used to develop the AI program are known.
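The replicability failure described above is essentially a dataset-shift problem: a model fit to a cohort dominated by one demographic group can score well overall while performing poorly on underrepresented groups in an external test set. The following minimal Python sketch (not from the article; all data, group labels, and parameters are synthetic and hypothetical) illustrates why reporting subgroup composition matters, using a training cohort that is 90 % one group and an external cohort that is balanced:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_cohort(n, frac_group_a):
    # Synthetic cohort: 'group' stands in for a demographic attribute
    # (e.g., sex or race). The label depends on the feature differently
    # in each group, mimicking a relationship that shifts across subgroups.
    group = rng.random(n) < frac_group_a          # True = group A
    x = rng.normal(size=(n, 1))
    # The label rule differs by group, so a model fit mostly on group A
    # generalizes poorly to group B.
    y = np.where(group, x[:, 0] > 0, x[:, 0] < 0).astype(int)
    return x, y, group

# Training set dominated by group A; external test set is balanced.
X_tr, y_tr, g_tr = make_cohort(5000, frac_group_a=0.9)
X_te, y_te, g_te = make_cohort(5000, frac_group_a=0.5)

model = LogisticRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

print(f"overall external accuracy: {accuracy_score(y_te, pred):.2f}")
for name, mask in [("group A", g_te), ("group B", ~g_te)]:
    print(f"{name} accuracy: {accuracy_score(y_te[mask], pred[mask]):.2f}")

In this toy setting the overall external accuracy hovers near chance while group A accuracy stays high and group B accuracy collapses; without the subgroup breakdown (which requires that sex, ethnicity, and race be recorded and reported), the per-group failure would be invisible.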