Art or Artifact: Evaluating the Accuracy, Appeal, and Educational Value of AI-Generated Imagery in DALL·E 3 for Illustrating Congenital Heart Diseases

IF 3.5 | Zone 3 (Medicine) | Q1, Health Care Sciences & Services | Journal of Medical Systems | Pub Date: 2024-05-23 | DOI: 10.1007/s10916-024-02072-0

Mohamad-Hani Temsah, Abdullah N Alhuzaimi, Mohammed Almansour, Fadi Aljamaan, Khalid Alhasan, Munirah A Batarfi, Ibraheem Altamimi, Amani Alharbi, Adel Abdulaziz Alsuhaibani, Leena Alwakeel, Abdulrahman Abdulkhaliq Alzahrani, Khaled B Alsulaim, Amr Jamal, Afnan Khayat, Mohammed Hussien Alghamdi, Rabih Halwani, Muhammad Khurram Khan, Ayman Al-Eyadhy, Rakan Nazer


Citations: 0



Art or Artifact: Evaluating the Accuracy, Appeal, and Educational Value of AI-Generated Imagery in DALL·E 3 for Illustrating Congenital Heart Diseases.

Artificial intelligence (AI), particularly AI-generated imagery, has the potential to impact medical and patient education. This research explores the use of text-to-image AI generation in medical education, focusing on congenital heart diseases (CHD). Using ChatGPT's DALL·E 3, the research aims to assess the accuracy and educational value of AI-created images for 20 common CHDs. We used DALL·E 3 to generate a set of 110 images: ten depicting the normal human heart and five for each of the 20 common CHDs. The generated images were evaluated by a diverse group of 33 healthcare professionals, including cardiology experts, pediatricians, non-pediatric faculty members, trainees (medical students, interns, pediatric residents), and pediatric nurses. Using a structured framework, these professionals assessed each image for anatomical accuracy, the usefulness of in-picture text, its appeal to medical professionals, and its potential applicability in medical presentations. Each item was rated on a three-point Likert scale, producing a total of 3,630 image assessments. Most AI-generated cardiac images were rated poorly: 80.8% were rated as anatomically incorrect or fabricated, 85.2% as having incorrect text labels, and 78.1% as not usable for medical education. Nurses and medical interns had a more positive perception of the AI-generated cardiac images than faculty members, pediatricians, and cardiology experts. Complex congenital anomalies were significantly more prone to anatomical fabrication than simple cardiac anomalies. Significant challenges were also identified in image generation.
Based on our findings, we recommend a vigilant approach to the use of AI-generated imagery in medical education at present, underscoring the need for thorough validation and the importance of collaboration across disciplines. While we advise against its immediate integration until further validation is conducted, the study advocates fine-tuning future AI models on accurate medical data to enhance their reliability and educational utility.
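
The image and assessment counts quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes that each of the 33 evaluators rated every image exactly once, which the abstract implies through its total but does not state outright.

```python
# Study design figures as quoted in the abstract.
NORMAL_HEART_IMAGES = 10   # images of the normal human heart
CHD_CONDITIONS = 20        # number of common CHDs illustrated
IMAGES_PER_CHD = 5         # images generated per CHD
EVALUATORS = 33            # healthcare professionals who rated the images


def total_images(normal: int, conditions: int, per_condition: int) -> int:
    """Normal-heart images plus a fixed number of images per condition."""
    return normal + conditions * per_condition


def total_assessments(images: int, evaluators: int) -> int:
    """Assumption: every evaluator rated every image exactly once."""
    return images * evaluators


images = total_images(NORMAL_HEART_IMAGES, CHD_CONDITIONS, IMAGES_PER_CHD)
assessments = total_assessments(images, EVALUATORS)

print(f"images: {images}")            # compare with the 110 reported
print(f"assessments: {assessments}")  # compare with the 3630 reported
```

Under that assumption the reported totals are consistent with one another: 10 + 20 × 5 gives the image count, and multiplying by the number of evaluators gives the assessment count.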


Source journal: Journal of Medical Systems (Medicine: Health Care)

CiteScore: 11.60

Self-citation rate: 1.90%

Articles published: (no value given)

Review time: 4.8 months

About the journal: Journal of Medical Systems provides a forum for the presentation and discussion of the increasingly extensive applications of new systems techniques and methods in hospital, clinic, and physician's office administration; pathology, radiology, and pharmaceutical delivery systems; medical records storage and retrieval; and ancillary patient-support systems. The journal publishes informative articles, essays, and studies across the entire scale of medical systems, from large hospital programs to novel small-scale medical services. Education is an integral part of this amalgamation of sciences, and selected articles are published in this area. Since existing medical systems are constantly being modified to fit particular circumstances and to solve specific problems, the journal includes a special section devoted to status reports on current installations.