Background: Clinicians are central to treating tobacco use disorder, yet practical training is inconsistent, and confidence varies. Brief, text message-based microlearning may offer a low-burden way to strengthen foundational competencies in busy clinical settings.
Objective: This paper aims to evaluate whether a short SMS microlearning series improves clinicians' self-reported confidence in managing tobacco use disorder.
Methods: We conducted a single-arm, pre-post educational pilot at an academic medical center. A brief formative survey (13 items; 106 respondents) identified local knowledge gaps and informed message topics and sequencing. The 13-day series delivered 1 concise message per day with key teaching points and links to curated resources. The prespecified primary outcome was self-reported confidence in managing tobacco use disorder (1-100 scale) measured immediately before and after the series. Of the 34 clinicians who signed up, 22 completed the baseline questionnaire and enrolled (attendings: n=4, 18%; trainees: n=18, 82%). Changes in confidence among participants with paired ratings were tested with a paired t test. Engagement with embedded links was recorded.
Results: All enrolled participants completed the 13-day series; none unsubscribed. Postintervention confidence ratings were provided by 18 participants. Mean confidence increased from 60 (SD 16) at baseline to 85 (SD 10) after the series (t17=-10.71; P<.001). Embedded links were opened in 67% (178/266) of messages. Free-text feedback was predominantly positive and emphasized the convenience, clarity, and point-of-care usefulness of brief messages.
Conclusions: A brief SMS microlearning series was associated with a substantial improvement in clinicians' confidence to manage tobacco use disorder, with high completion and evidence of engagement. This low-cost, scalable approach appears practical for busy clinicians. Findings should be interpreted cautiously given the single-arm design, self-selection, and reliance on self-reported confidence rather than objective knowledge or clinical outcomes. Future studies should include a validated knowledge assessment, a randomized comparison, broader sampling, and follow-up to assess durability and impact on care.
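The pre-post comparison described above rests on a paired t test. The sketch below shows that computation on hypothetical confidence ratings (not the study's data); the sign convention (before minus after) matches the negative t statistic reported in the Results.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for pre-post ratings.

    t = mean(d) / (sd(d) / sqrt(n)), where d_i = before_i - after_i,
    so an improvement (after > before) yields a negative t.
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical 1-100 confidence ratings for 6 clinicians (illustrative only).
pre = [55, 62, 58, 70, 48, 65]
post = [82, 88, 80, 92, 75, 86]
t, df = paired_t(pre, post)
print(f"t{df} = {t:.2f}")
```

With 18 paired ratings, as in the study, the degrees of freedom would be 17, giving the t17 form reported above.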
Text Message (SMS) Microlearning for Tobacco Use Disorder: Pre-Post Pilot Study of Clinician Confidence. Zehra Dhanani, Veena Dronamraju, Jamie Garfield. JMIR Medical Education. 2025;11:e73821. doi:10.2196/73821
Text Message (SMS) Microlearning for Tobacco Use Disorder: Pre-Post Pilot Study of Clinician Confidence. Zehra Dhanani, Veena Dronamraju, Jamie Garfield. JMIR Medical Education. 2025;11:e73821. doi:10.2196/73821
Carl Preiksaitis, Joshua Hughes, Rana Kabeer, William Dixon, Christian Rose
Background: The optimal duration of emergency medicine (EM) residency training remains a subject of national debate, with the Accreditation Council for Graduate Medical Education considering standardizing all programs to 4 years. However, empirical data on how residents accumulate clinical exposure over time are limited. Traditional measures, such as case logs and diagnostic codes, often fail to capture the breadth and depth of diagnostic reasoning. Natural language processing (NLP) of clinical documentation offers a novel approach to quantifying clinical experiences more comprehensively.
Objective: This study aimed to (1) quantify how EM residents acquire clinical topic exposure over the course of training, (2) evaluate variation in exposure patterns across residents and classes, and (3) assess changes in workload and case complexity over time to inform the discussion on optimal program length.
Methods: We conducted a retrospective cohort study of EM residents at Stanford Hospital, analyzing 244,255 emergency department encounters from July 1, 2016, to November 30, 2023. The sample included 62 residents across 4 graduating classes (2020-2023), representing all primary training site encounters where residents served as primary or supervisory providers. Using a retrieval-augmented generation NLP pipeline, we mapped resident clinical documentation to the 895 subcategories of the 2022 Model for Clinical Practice of Emergency Medicine (MCPEM) via intermediate mapping to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) Clinical Observations, Recordings, and Encoding (CORE) problem list subset. We generated cumulative topic exposure curves, quantified the diversity of topic coverage, assessed variability between residents, and analyzed the progression in clinical complexity using Emergency Severity Index (ESI) scores and admission rates.
Results: Residents encountered the largest increase in new topics during postgraduate year 1 (PGY1), averaging 376.7 (42.1%) unique topics among a total of 895 MCPEM subcategories. By PGY4, they averaged 565.9 (63.2%) topics, representing a 9.9% (51/515) increase over PGY3. Exposure plateaus generally occurred at 39 to 41 months, although substantial individual variation was observed, with some residents continuing to acquire new topics until graduation. Annual case volume more than tripled from PGY1 (mean 445.7, SD 112.7 encounters) to PGY4 (mean 1528.4, SD 112.7 encounters). Case complexity increased, as evidenced by a decrease in mean ESI score from 2.94 to 2.79 and a rise in high-acuity (ESI 1-2) cases from 16% (4374/27,340) to 30.9% (9418/30,466).
Conclusions: NLP analysis of clinical documentation provides a scalable, detailed method for tracking EM residents' clinical exposure and progression. Many residents continue to gain new experiences into their fourth year, particularly in higher-acuity cases.
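The cumulative topic exposure curves described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: each encounter is assumed to have already been mapped (e.g., by the NLP step) to a set of curriculum topic labels, and the curve records how many unique topics a resident has seen after each encounter.

```python
from typing import Iterable, List, Set

def cumulative_exposure(encounters: Iterable[Set[str]]) -> List[int]:
    """Cumulative count of unique topics after each encounter.

    Each encounter is the set of topics (e.g., MCPEM subcategories)
    its documentation mapped to; the returned curve shows how quickly
    new topics accumulate over training.
    """
    seen: Set[str] = set()
    curve: List[int] = []
    for topics in encounters:
        seen.update(topics)
        curve.append(len(seen))
    return curve

# Toy encounter stream (topic labels are made up for illustration).
stream = [{"chest pain"}, {"chest pain", "dyspnea"},
          {"sepsis"}, {"dyspnea"}, {"stroke", "sepsis"}]
print(cumulative_exposure(stream))  # → [1, 2, 3, 3, 4]
```

A plateau in such a curve, as reported at 39 to 41 months, appears as a long run of encounters that add no new topics.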
Quantifying Emergency Medicine Residency Learning Curves Using Natural Language Processing: Retrospective Cohort Study. JMIR Medical Education. 2025;11:e82326. doi:10.2196/82326
Bastien Le Guellec, Victoria Gauthier, Rémi Lenain, Alexandra Nuytten, Luc Dauchet, Brigitte Bonneau, Erwin Gerard, Claire Castandet, Patrick Truffert, Marc Hazzan, Philippe Amouyel, Raphaël Bentegeac, Aghiles Hamroun
Background: Early exposure to research methodology is essential in medical education, yet many students show limited motivation to engage with nonclinical content. Gamified strategies such as educational escape rooms may help improve engagement, but few studies have explored their feasibility at scale or evaluated their impact beyond student satisfaction.
Objective: This study aimed to assess the feasibility, engagement, and perceived educational value of a large-scale escape room specifically designed to introduce third-year medical students to the principles of diagnostic test evaluation.
Methods: We developed a low-cost immersive escape room based on a fictional diagnostic accuracy study with 6 puzzles mapped to 5 predefined learning objectives: (1) identifying key components of a diagnostic study protocol, (2) selecting an appropriate gold standard test, (3) defining a relevant study population, (4) building and interpreting a contingency table, and (5) critically appraising diagnostic metrics in context. The intervention was deployed to an entire class of third-year medical students across 12 sessions between March 2023 and April 2023. Each session included 60 minutes of gameplay and a 45-minute debriefing. Students completed pre- and postintervention questionnaires assessing their knowledge of diagnostic test evaluation and perceptions of research training. Descriptive statistics and 2-tailed paired t tests were used to evaluate score changes; univariate linear regressions assessed associations with demographics. Free-text comments were analyzed using the Reinert hierarchical classification method.
Results: Of the 530 participants, 490 (92.5%) completed the full evaluation. Many participants had limited previous exposure to escape rooms (206/490, 42% had never participated in one), and most (253/490, 51.6%) reported low initial confidence in the critical appraisal of scientific articles. Mean overall knowledge scores increased from 62 of 100 (SD 1) before to 82 of 100 (SD 2) after the activity (+32%; P<.001). Gains were observed across all learning objectives and were not influenced by age, sex, or previous experience. Students rated the educational escape room as highly entertaining (mean score 9.1/10, SD 1.1) and educational (mean score 8.2/10, SD 1.5). Following the intervention, 86.9% (393/452) felt more comfortable with critical appraisal of diagnostic test studies, and 79% (357/452) considered the escape room format highly appropriate for an introductory session.
Conclusions: This study demonstrates the feasibility and enthusiastic reception of a large-scale, reusable escape room aimed at teaching the fundamental principles of diagnostic test evaluation to undergraduate medical students. This approach may serve as a valuable entry point to engage students with evidence-based reasoning and pave the way for deeper exploration.
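Two of the escape room's learning objectives concern building a 2x2 contingency table and appraising diagnostic metrics. A minimal sketch of those computations, with made-up counts for illustration:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy metrics from a 2x2 contingency table.

    Rows: index test result (+/-); columns: gold standard (disease +/-).
    tp/fp/fn/tn are the four cell counts.
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(test+ | disease+)
        "specificity": tn / (tn + fp),  # P(test- | disease-)
        "ppv": tp / (tp + fp),          # P(disease+ | test+)
        "npv": tn / (tn + fn),          # P(disease- | test-)
    }

# Worked example with hypothetical counts: 90 true positives,
# 10 false negatives, 30 false positives, 870 true negatives.
m = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
print({k: round(v, 3) for k, v in m.items()})
```

Note that sensitivity and specificity are properties of the test, whereas the predictive values also depend on disease prevalence in the study population, which is why objective (3), defining a relevant population, matters for appraisal.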
Engaging Undergraduate Medical Students With Introductory Research Training via an Educational Escape Room: Mixed Methods Evaluation. JMIR Medical Education. 2025;11:e71339. doi:10.2196/71339
Liyuan Xu, Qinrong Xu, Chunya Liu, Baozhen Chen, Chunxia Wang
Background: Clinical internal medicine practice training traditionally relies on case-based teaching. This approach limits the development of students' clinical thinking skills and places significant pressure on instructors. Virtual standardized patients (VSPs) could offer an alternative solution, but evidence on their feasibility and effectiveness remains limited.
Objective: This study aimed to establish the clinical application of VSPs through the "VSPs in general practice" interactive diagnostic and teaching system, which presents 3D virtual simulated patients in simulated clinical scenarios and trains medical students through system-preset cases, and to compare its effectiveness with traditional teaching in improving students' clinical thinking ability.
Methods: A randomized controlled trial was conducted from October 20, 2022, to October 20, 2024. A total of 60 medical students interning at Quzhou People's Hospital were enrolled and divided into 2 groups: an experimental group receiving VSP training (30/60, 50%) and a control group receiving traditional academic training (30/60, 50%). Teaching effectiveness was evaluated using basic knowledge assessments and virtual system scoring. After completing the course, students were surveyed with a questionnaire to assess their satisfaction with the course.
Results: All enrolled medical students completed the study. The experimental group showed significantly greater improvement in theoretical scores than the control group (mean 17.07, SD 4.24 vs mean 10.67, SD 4.91; F1,59=29.20; Cohen d=1.15; 95% CI 12.43-15.31; P<.001), and the total score improvement on the virtual clinical thinking training system test was also significantly greater in the experimental group (mean 42.60, SD 9.56 vs mean 31.63, SD 7.24; F1,59=25.10; Cohen d=1.09; 95% CI 34.51-39.72; P<.001). Specifically, improvements in consultation skills (mean 8.76, SD 1.67 vs mean 7.66, SD 2.08; F1,59=31.09; Cohen d=0.55; 95% CI 7.70-8.70; P<.001), overall objective improvement (mean 11.97, SD 2.77 vs mean 8.15, SD 2.62; F1,59=30.08; Cohen d=1.16; 95% CI 9.21-10.91; P<.001), initial diagnostic ability (mean 8.74, SD 1.67 vs mean 7.66, SD 2.08; F1,59=4.91; Cohen d=0.55; 95% CI 7.70-8.70; P=.03), and ability to provide patient treatment (mean 7.23, SD 2.41 vs mean 5.72, SD 2.19; F1,59=6.42; Cohen d=0.63; 95% CI 5.85-7.01; P=.01) were all significantly greater in the experimental group than in the control group. The questionnaire results indicated that 90% (27/30) of the students who participated in the VSP training believed it could enhance their clinical thinking abilities.
Conclusions: VSPs reinforce the foundational knowledge of internal medicine among medical students, strengthen their clinical thinking ability, and improve their capacity for independent work. The VSP system is feasible, practical, and cost-effective and merits wider adoption in clinical education.
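The between-group effect sizes above are Cohen d values. A minimal sketch of the pooled-SD computation, using illustrative numbers rather than the trial's raw data (which are not reported here):

```python
import math
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen d for two independent groups using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    sa, sb = stdev(group_a), stdev(group_b)
    pooled = math.sqrt(((na - 1) * sa**2 + (nb - 1) * sb**2) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled

# Illustrative score improvements (not the trial's data): both groups
# have SD 2 and means 10 vs 8, so d = (10 - 8) / 2 = 1.0.
print(cohens_d([8, 10, 12], [6, 8, 10]))  # → 1.0
```

By the usual convention, d around 0.5 is a medium effect and d above 0.8 a large one, so the reported values of 1.09 to 1.16 for the overall score improvements indicate large effects.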
Virtual Standardized Patients for Improving Clinical Thinking Ability Training in Residents: Randomized Controlled Trial. JMIR Medical Education. 2025;11:e73196. doi:10.2196/73196
Lujain Aloum, Halah Ibrahim, Senthil Kumar Rajasekaran, Eman Alefishat
Background: Medical education continues to favor didactic lectures as the predominant method of instruction. However, in recent years, there has been a shift toward active learning methodologies such as gamification.
Objective: This study aimed to describe the implementation of 3 open-access, web-based pharmacology games tailored for medical students: Cross DRUGs, Find the DRUG, and DRUGs Escape Room. The study also evaluated the impact of gamification on knowledge retention, student engagement, and learning experience in pharmacology education.
Methods: We used a quasi-experimental design to examine the effects of gamification on knowledge retention by comparing pretest and posttest scores between the gamer and control groups. Each week, students self-selected into either the gamer group or the control group based on personal preference. All students were provided with online access to the same lecture slides. Students in the control group completed both the pretest and posttest but did not play any of the games. A survey was administered to assess students' perceptions of gamification as a learning tool.
Results: Of the 72 students enrolled in the course, 49 (68%) agreed to participate, with 40 (56%) students completing both the pretest and posttest and being included in our analysis. As participation could vary weekly, an individual student might have appeared in both groups across different weeks, resulting in 59 gamer sessions and 20 control sessions. The mean pretest scores were 6.05 (SD 2.31) for the control group and 6.20 (SD 2.13) for the gamer group. The mean posttest scores were 6.90 (SD 2.02) for the control group and 8.47 (SD 1.30) for the gamer group. The gamer group exhibited significantly improved posttest scores (P=.006), while the control group did not (P=.21). Most respondents (25/30, 83%) found the games enjoyable and agreed that the games effectively helped them understand pharmacological concepts (24/30, 80%). Additionally, 70% (21/30) of students believed they learned better from the gaming format than from didactic lectures. Most favored a blended approach that combines lectures with games or case studies.
Conclusions: Gamification can serve as an effective complementary teaching tool for helping medical students learn pharmacological concepts.
Open-Access Web-Based Gamification in Pharmacology Education for Medical Students: Quasi-Experimental Study. JMIR Medical Education. 2025;11:e73666. doi:10.2196/73666
Background: Cancer immunotherapy represents a transformative advancement in oncology, offering new avenues for treating malignancies by harnessing the immune system. Despite its growing clinical relevance, immunotherapy remains underrepresented in undergraduate medical education, particularly in curricula integrating foundational immunology with clinical application. To address this gap, we developed and implemented a fully online elective for fourth-year medical students focused on core immunology concepts, immunotherapy mechanisms, FDA-approved treatments, immune-related adverse events, and patient-centered therapeutic decision-making.
Objective: This study aimed to evaluate the effectiveness of an asynchronous-synchronous online cancer immunotherapy elective in improving medical student knowledge, engagement, and critical-thinking skills. We hypothesized that participation in the elective would be associated with perceived improvements in knowledge and clinical preparedness and would inform future strategies for integrating cancer immunotherapy into medical curricula.
Methods: We conducted a mixed-methods study with fourth-year medical students enrolled in a two-week elective at a US medical school. The curriculum included a self-paced foundational module, an online discussion board, and a capstone oral presentation requiring students to propose a novel immunotherapy approach. Participants completed pre- and post-course quizzes assessing immunotherapy knowledge and an anonymous post-course Likert-scale survey. Quantitative data were summarized descriptively, and Likert responses were reported using medians and interquartile ranges (IQRs). Because of the small sample size, unpaired t tests comparing pre- and post-course quiz averages were underpowered to detect statistically significant differences. Qualitative data were analyzed using inductive thematic analysis with investigator triangulation.
Results: A total of 35 students completed the elective, and 20 submitted the post-course survey (response rate: 57%). Across all Likert-scale items, students reported a median response of 5 (Strongly Agree) with IQR values ranging from 0 to 1, indicating uniformly positive perceptions and minimal variability in their evaluation of the course. Descriptively, average post-course quiz scores were higher than pre-course scores, suggesting improved conceptual understanding. Qualitative thematic analysis revealed three major themes: (1) increased confidence engaging with complex immunotherapy mechanisms, (2) appreciation for the flexibility and interactivity afforded by the hybrid asynchronous-synchronous model, and (3) enhanced understanding of the real-world clinical application of immunotherapy across interdisciplinary settings.
Conclusions: Descriptive quantitative and qualitative findings suggest that a targeted online cancer immunotherapy elective may enh
Implementation and Evaluation of a Cancer Immunotherapy Elective for Medical Students: A Mixed-Methods Descriptive Study. Mark Raynor, Rivers Hock, Brandon Godinich, Satish Maharaj, Houriya Ayoubieh, Cynthia Perry, Jessica Chacon. JMIR Medical Education. doi:10.2196/71628. Published December 3, 2025.
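Likert summaries like the elective's (median 5, IQR 0 to 1) can be computed directly with Python's statistics module. The responses below are invented for illustration, not the survey's data:

```python
import statistics

def likert_summary(responses):
    """Median and interquartile range (Q3 - Q1) of 1-5 Likert responses."""
    q = statistics.quantiles(responses, n=4, method="inclusive")
    return statistics.median(responses), q[2] - q[0]

# Hypothetical responses to one Likert item (5 = Strongly Agree)
responses = [5, 5, 4, 5, 5, 4, 5, 4, 5, 4]
median, iqr = likert_summary(responses)
print(f"median {median}, IQR {iqr}")
```

Medians with IQRs are the usual summary for ordinal Likert data, since means assume equal spacing between response options.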
Trung Anh Nguyen, Thanh Binh Nguyen, Duy Cuong Nguyen, Anh Dung Vu, Khanh Linh Dang, Nhu Quynh Le, Duy Anh Ngo, Dang Kien Nguyen, Van Thuan Hoang, Thanh Binh Ngo
Artificial intelligence (AI) has the potential to transform medical training through adaptive learning, immersive simulations, automated assessments, and data-driven insights, offering solutions to persistent issues such as high student-to-faculty ratios, overcrowded classrooms, and limited clinical exposure. Globally, many universities have already embedded AI literacy and competencies into undergraduate, postgraduate, and continuing education programs, while in Vietnam, the use of AI in medical education remains limited and fragmented. Most students have little formal exposure to AI, and empirical evidence on faculty or institutional readiness is scarce. Experiences from other countries, including Malaysia, Palestine, and Oman, demonstrate that incremental adoption and faculty development can facilitate cultural acceptance and curricular innovation, providing useful lessons for Vietnam. At the same time, significant barriers remain. These include inadequate infrastructure in provincial universities, low levels of AI literacy among both students and educators, underdeveloped regulatory and ethical frameworks, and resistance to pedagogical change. Cost-effectiveness and sustainability are additional concerns in a middle-income context, where upfront investments must be balanced against long-term benefits and equitable access. Advancing AI in Vietnamese medical education will therefore require a coordinated national strategy that prioritizes infrastructure, AI literacy, faculty development, quality assurance, and sustainable funding models, alongside ethical and legal safeguards. By addressing these key foundations, Vietnam can harness AI not only to modernize medical education but also to strengthen preparedness for a digitally enabled health workforce.
What Are the Opportunities and Challenges of Using AI in Medical Education in Vietnam? JMIR Medical Education. 2025;11:e77817. doi:10.2196/77817. Published December 2, 2025.
Boris Modrau, Karina Frahm Kirk, Sinan Mouaayad Abdulaimma Said, Carsten Reidies Bjarkam, Lone Sunde, Jacob Bodilsen, Jakob Dal, Jette Kolding Kristensen, Jeppe Emmersen, Mike Bundgaard Astorp, Stig Andersen
Background: The impact of Pass/Fail versus Tiered grade assessment for exams in undergraduate medical education has caused much debate, but there are few data to inform decision-making. The increasing number of medical schools that have transitioned to Pass/Fail assessment has raised concerns about medical students' academic performance. In 2018, during the undergraduate medical curriculum reform at the Faculty of Medicine, Aalborg University, some exams were changed from Pass/Fail to Tiered grade and vice versa. These changes provide an opportunity to evaluate the different assessment forms.
Objective: This study aimed to evaluate medical students' academic performance at the final licensing exam in relation to the exam grading principle.
Methods: This single-center cohort study at Aalborg University Medical School, North Denmark Region, assessed the change from 2-digit Tiered grade to Pass/Fail evaluation, and vice versa, of undergraduate medical students' exams after the 4th and 5th year clinical training modules from Autumn 2015 through Spring 2023. The primary outcomes were (1) the average grades at the final licensing exam and (2) the number of students failing exams during the preceding two years.
Results: Among the 7634 exams in total, 7164 4th and 5th year clinical training exams were included in the comparisons, of which 3047 (42.5%) were Pass/Fail exams and 4117 (57.5%) were Tiered grade exams. The frequency of students failing was 3.3% (101/3047) with Pass/Fail exams and 2.0% (81/4117) with Tiered grade exams (P<.001). This difference disappeared when the near-fail Tiered grade was counted as a Fail. Tiered grade results did not differ between semesters (P=.99), nor did they show a time trend in the 4th year (P=.66). The final licensing exam grades were unaltered (P=.47).
Conclusions: Contrary to our expectation, Pass/Fail exams exhibited a higher fail rate than Tiered grade exams without lowering final academic performance. These results suggest that a shift from Tiered grades to Pass/Fail assessment redirects the focus from rewarding high performance to ensuring that standards are maintained among underperforming students.
Pass/Fail Versus Tiered Grades and Academic Performance in Undergraduate Medical Education: Crossover Study. JMIR Medical Education. 2025;11:e74975. doi:10.2196/74975. Published December 2, 2025.
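The fail-rate comparison above (101/3047 Pass/Fail vs 81/4117 Tiered grade) can be checked with a Pearson chi-square test on a 2x2 table. The abstract does not name its test, so this is a sketch of one standard choice, using only the counts reported:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (df = 1) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Fail/pass counts from the abstract: 101 of 3047 Pass/Fail exams failed,
# 81 of 4117 Tiered grade exams failed
chi2 = chi_square_2x2(101, 3047 - 101, 81, 4117 - 81)
print(f"chi-square = {chi2:.2f}")
```

The statistic comfortably exceeds 10.83, the df = 1 critical value for P < .001, consistent with the reported significance.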
Background: Large language models (LLMs) offer the potential to improve virtual patient-physician communication and reduce health care professionals' workload. However, limitations in accuracy, outdated knowledge, and safety issues restrict their effective use in real clinical settings. Addressing these challenges is crucial for making LLMs a reliable health care tool.
Objective: This study aimed to evaluate the efficacy of Med-RISE, an information retrieval and augmentation tool, in comparison with baseline LLMs, focusing on enhancing accuracy and safety in medical question answering across diverse clinical domains.
Methods: This comparative study introduces Med-RISE, an enhanced version of a retrieval-augmented generation framework specifically designed to improve question-answering performance across wide-ranging medical domains and diverse disciplines. Med-RISE consists of 4 key steps: query rewriting, information retrieval (providing local and real-time retrieval), summarization, and execution (a fact and safety filter before output). This study integrated Med-RISE with 4 LLMs (GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B) and assessed their performance on 4 multiple-choice medical question datasets: MedQA (US Medical Licensing Examination), PubMedQA (original and revised versions), MedMCQA, and EYE500. Primary outcome measures included answer accuracy and hallucination rates, with hallucinations categorized into factuality (inaccurate information) or faithfulness (inconsistency with instructions) types. This study was conducted between March 2024 and August 2024.
Results: The integration of Med-RISE with each LLM led to a substantial increase in accuracy, with improvements ranging from 9.8% to 16.3% (mean 13%, SD 2.3%) across the 4 datasets. The accuracy improvements were 16.3%, 12.9%, 13%, and 9.8% for GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B, respectively. In addition, Med-RISE effectively reduced hallucinations, with reductions ranging from 11.8% to 18% (mean 15.1%, SD 2.8%); factuality hallucinations decreased by 13.5% and faithfulness hallucinations by 5.8%. The hallucination rate reductions were 17.7%, 12.8%, 18%, and 11.8% for GPT-3.5, GPT-4, Vicuna-13B, and ChatGLM-6B, respectively.
Conclusions: The Med-RISE framework significantly improves the accuracy and reduces the hallucinations of LLMs in medical question answering across benchmark datasets. By providing local and real-time information retrieval and fact and safety filtering, Med-RISE enhances the reliability and interpretability of LLMs in the medical domain, offering a promising tool for clinical practice and decision support.
Enhancing Large Language Models for Improved Accuracy and Safety in Medical Question Answering: Comparative Study. Dingqiao Wang, Jinguo Ye, Jingni Li, Jiangbo Liang, Qikai Zhang, Qiuling Hu, Caineng Pan, Dongliang Wang, Zhong Liu, Wen Shi, Mengxiang Guo, Fei Li, Wei Du, Ying-Feng Zheng. JMIR Medical Education. 2025;11:e70190. doi:10.2196/70190. Published December 2, 2025.
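The 4-step retrieval-augmented pattern described in that abstract (query rewriting, retrieval, summarization, then a fact/safety filter before output) can be sketched as a minimal pipeline. Every function name and the toy corpus below are hypothetical stand-ins, not the Med-RISE implementation:

```python
# Toy knowledge base standing in for local/real-time retrieval sources
CORPUS = {
    "metformin": "Metformin is a first-line therapy for type 2 diabetes.",
    "glaucoma": "Glaucoma is managed by lowering intraocular pressure.",
}

def rewrite(query: str) -> str:
    # Step 1: query rewriting (a real system would use an LLM here)
    return query.lower().strip("?")

def retrieve(query: str) -> list[str]:
    # Step 2: keyword lookup standing in for document retrieval
    return [text for key, text in CORPUS.items() if key in query]

def summarize(passages: list[str]) -> str:
    # Step 3: condense retrieved evidence into the prompt context
    return " ".join(passages)

def execute(context: str) -> str:
    # Step 4: fact/safety filter before output; refuse when unsupported
    if not context:
        return "Insufficient evidence to answer safely."
    return f"Based on retrieved evidence: {context}"

def answer(query: str) -> str:
    return execute(summarize(retrieve(rewrite(query))))

print(answer("Is metformin used in type 2 diabetes?"))
```

The key design point mirrored here is that the final stage can decline to answer when retrieval returns nothing, which is one way such frameworks trade coverage for a lower hallucination rate.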
Alistair Thorpe, Angelos P Kassianos, Ruth Plackett, Vinodh Krishnamurthy, Maria A Kambouri, Jessica Sheringham
Background: Clinical reasoning is increasingly recognized as an important skill in the diagnosis of common and serious conditions. eCREST (electronic Clinical Reasoning Educational Simulation Tool), a clinical reasoning learning resource, was developed to support medical students to learn clinical reasoning. However, primary care teams now encompass a wider range of professional groups, such as physician assistants (PAs), who also need to develop clinical reasoning during their training. Understanding PAs' clinical reasoning processes is key to judging the transferability of learning resources initially targeted to medical students.
Objective: This exploratory study aimed to measure the processes of clinical reasoning undertaken on eCREST by PA students and compare PAs' reasoning processes with previous data collected on medical students.
Methods: Between 2017 and 2021, PA students and medical students used eCREST to learn clinical reasoning skills in an experimental or learning context. Students undertook 2 simulated cases of patients presenting with lung symptoms. They could ask questions, order bedside tests, and select physical exams during the case to help them form, reflect on, and reconsider diagnostic ideas and management strategies while completing a case. Exploratory analysis was undertaken by comparing students' data gathering, flexibility in diagnosis, and diagnostic ideas between medical and PA students.
Results: In total, 159 medical students and 54 PA students completed the cases. PAs were older (mean 27, SD 7 y vs mean 24, SD 4 y; P<.001) and more likely to be female (43/54, 80% vs 84/159, 53%; P<.001). Medical and PA students were similar in the proportion of essential questions asked (Case 1: mean 70.1 vs mean 73.2; P=.33; Case 2: mean 74.6 vs mean 70.9; P=.27), physical examinations requested (Case 1: mean 54.7 vs mean 54.0; P=.59; Case 2: mean 69.3 vs mean 67.5; P=.59), bedside tests selected (Case 1: mean 74.4 vs mean 83.3; P=.05; Case 2: mean 47.9 vs mean 50.0; P=.69), and number of times they changed their diagnoses (Case 1: mean 2.8 vs mean 2.8; P=.99; Case 2: mean 2.8 vs mean 2.5; P=.81). Both student groups improved in their diagnostic accuracy during the cases.
Conclusions: These results provide suggestive evidence that medical and PA students had similar clinical reasoning styles when using an online training tool to support their diagnostic decision-making.
Comparison of Physician Assistant and Medical Students' Clinical Reasoning Processes Using an Online Patient Simulation Tool to Support Clinical Reasoning (eCREST): Mixed Methods Study. JMIR Medical Education. 2025;11:e68981. doi:10.2196/68981. Published December 1, 2025.