Can AI-generated clinical vignettes in Japanese be used medically and linguistically?

Yasutaka Yanagita, Daiki Yokokawa, Shun Uchida, Yu Li, Takanori Uehara, Masatomi Ikusaka
medRxiv - Medical Education. Published 2024-03-02. DOI: 10.1101/2024.02.28.24303173 (https://doi.org/10.1101/2024.02.28.24303173)

Abstract

Background
Creating clinical vignettes requires considerable effort. Recent developments in generative artificial intelligence (AI) for natural language processing have been remarkable and may allow for the easy and immediate creation of diverse clinical vignettes.

Objective
In this study, we evaluated the medical accuracy and grammatical correctness of AI-generated clinical vignettes in Japanese and verified their usefulness.

Methods
Clinical vignettes in Japanese were created using the generative AI model GPT-4-0613. The input prompts for the clinical vignettes specified the following seven elements: 1) age, 2) sex, 3) chief complaint and time course since onset, 4) physical findings, 5) examination results, 6) diagnosis, and 7) treatment course. The list of diseases integrated into the vignettes was based on 202 cases considered in the management of diseases and symptoms in Japan's Primary Care Physicians Training Program. The clinical vignettes were evaluated for medical and Japanese-language accuracy by three physicians using a five-point scale. A total score of 13 points or above was defined as 'sufficiently beneficial and immediately usable with minor revisions,' a score between 10 and 12 points was defined as 'partly insufficient and in need of modifications,' and a score of 9 points or below was defined as 'insufficient.'

Results
Regarding medical accuracy, of the 202 clinical vignettes, 118 scored 13 points or above, 78 scored between 10 and 12 points, and 6 scored 9 points or below. Regarding Japanese-language accuracy, 142 vignettes scored 13 points or above, 56 scored between 10 and 12 points, and 4 scored 9 points or below. Overall, 97% (196/202) of the vignettes were usable with some modifications.

Conclusions
Overall, 97% of the clinical vignettes proved practically useful, based on confirmation and revision by Japanese physicians. Given the significant effort required by physicians to create vignettes without AI assistance, the use of GPT is expected to greatly optimize this process.
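
The scoring bands and the reported proportions can be checked with a short sketch. It assumes that the "total score" is the sum of the three physicians' five-point ratings (range 3 to 15), which the abstract implies but does not state outright, and that the 97% figure corresponds to the medical-accuracy counts (118 + 78 of 202). The function names and band labels are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: names and labels are hypothetical.
# Assumption: total score = sum of three raters' scores, each from 1 to 5.

def total_score(ratings):
    """Sum three five-point ratings into a total between 3 and 15."""
    if len(ratings) != 3 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected three ratings, each between 1 and 5")
    return sum(ratings)

def classify_total(total):
    """Map a total score to the band defined in the abstract."""
    if not 3 <= total <= 15:
        raise ValueError("total must be between 3 and 15")
    if total >= 13:
        return "usable with minor revisions"
    if total >= 10:
        return "needs modifications"
    return "insufficient"

def share_scoring_10_or_more(counts):
    """Fraction of vignettes in the two upper bands."""
    n = sum(counts.values())
    return (counts["13+"] + counts["10-12"]) / n

# Counts as reported in the Results section.
medical_counts = {"13+": 118, "10-12": 78, "<=9": 6}
japanese_counts = {"13+": 142, "10-12": 56, "<=9": 4}

if __name__ == "__main__":
    print(classify_total(total_score([5, 4, 4])))
    print(f"medical: {share_scoring_10_or_more(medical_counts):.1%}")
    print(f"japanese: {share_scoring_10_or_more(japanese_counts):.1%}")
```

Under these assumptions, the medical-accuracy counts reproduce the 196/202 figure quoted in the abstract, while the Japanese-language counts give 198/202.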