{"title":"Using the past to explain the present: understanding tiered grading in medical education.","authors":"James F Smith, Nicole M Piemonte","doi":"10.1093/acamed/wvaf015","DOIUrl":"https://doi.org/10.1093/acamed/wvaf015","url":null,"abstract":"","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145953775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Young-Min Kim, Young-Mee Lee, Do-Hwan Kim, Suyoun Kim, Ji-Hoon Kim, Hye Rim Jin, Chang-Jin Choi
Problem: Although the use of artificial intelligence (AI) as a diagnostic aid is increasing in clinical practice, medical education provides little training on how to incorporate AI-generated information into diagnosis and use it effectively in shared decision-making (SDM) with patients.
Approach: The authors developed and piloted a simulation-based course to teach AI-assisted SDM to final-year medical students preparing for residency. Conducted between June and October 2023, the course combined online prelearning with onsite simulations using clinically approved AI tools (Lunit INSIGHT CXR, version 3.1.4.1 and MMG, version 1.1.4.3; Lunit Inc, Seoul, South Korea; used November 16 and 27, 2023). Scenarios portrayed asymptomatic patients with incidental findings (eg, pulmonary nodules, breast microcalcifications). Students engaged in two 12-minute simulated patient encounters featuring SDM with 2 management options. Sessions concluded with simulated patient-written feedback and expert-facilitated debriefing. Twenty-seven students from 3 medical schools participated.
Outcomes: Program evaluation showed significant improvements in participants' comprehension and confidence in SDM (t = 6.51 and t = 7.56, P < .001, respectively) and AI-assisted SDM (t = 5.72 and t = 5.80, P < .001, respectively). Students found AI tools helpful for facilitating SDM and patient communication. Thematic analysis of interviews highlighted strengths, such as structured course design and reflective debriefing. Participants noted that prior education focused on diagnostic algorithms, whereas this course emphasized patient communication and preference-based decisions. They found AI tools useful for diagnosis and supporting discussion with patients through visual outputs. However, they identified limitations, including their own clinical knowledge gaps and the AI tools' lack of explainability. They suggested integrating SDM and AI-assisted diagnosis training into formal curricula to better prepare students for clinical practice.
Next steps: Future efforts should focus on integrating this course into undergraduate curricula or transition training programs to provide experiential learning opportunities in AI-assisted clinical practice.
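The pre/post comparisons above are reported as paired t statistics. As a rough illustration of that calculation, here is a minimal sketch using hypothetical 5-point confidence ratings (not the study's data):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired t statistic: t = mean(d) / (sd(d) / sqrt(n)), d = post - pre."""
    d = [b - a for a, b in zip(pre, post)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical pre/post confidence ratings for 8 learners (illustrative only).
pre  = [2, 3, 2, 3, 2, 3, 2, 2]
post = [4, 4, 3, 5, 4, 4, 3, 4]
print(round(paired_t(pre, post), 2))  # → 7.94
```

A t value this large with n - 1 degrees of freedom corresponds to P < .001, which is how the reported improvements should be read.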
{"title":"Artificial intelligence-assisted shared decision-making training for medical students transitioning to residency.","authors":"Young-Min Kim, Young-Mee Lee, Do-Hwan Kim, Suyoun Kim, Ji-Hoon Kim, Hye Rim Jin, Chang-Jin Choi","doi":"10.1093/acamed/wvaf006","DOIUrl":"https://doi.org/10.1093/acamed/wvaf006","url":null,"abstract":"<p><strong>Problem: </strong>Although the use of artificial intelligence (AI) as a diagnostic aid is increasing in clinical practice, medical education provides little training on how to incorporate AI-generated information into diagnosis and use it effectively in shared decision-making (SDM) with patients.</p><p><strong>Approach: </strong>The authors developed and piloted a simulation-based course to train AI-assisted SDM to final-year medical students preparing for residency. Conducted between June and October 2023, the course combined online prelearning with onsite simulations using clinically approved AI tools (Lunit INSIGHT CXR, version 3.1.4.1 and MMG, version 1.1.4.3; Lunit Inc, Seoul, South Korea; used November 16 and 27, 2023). Scenarios portrayed asymptomatic patients with incidental findings (eg, pulmonary nodules, breast microcalcifications). Students engaged in two 12-minute simulated patient encounters featuring SDM with 2 management options. Sessions concluded with simulated patient-written feedback and expert-facilitated debriefing. Twenty-seven students from 3 medical schools participated.</p><p><strong>Outcomes: </strong>Program evaluation showed significant improvements in participants' comprehension and confidence in SDM (t = 6.51 and t = 7.56, P < .001, respectively) and AI-assisted SDM (t = 5.72 and t = 5.80, P < .001, respectively). Students found AI tools helpful for facilitating SDM and patient communication. Thematic analysis of interviews highlighted strengths, such as structured course design and reflective debriefing. 
Participants noted that prior education focused on diagnostic algorithms, whereas this course emphasized patient communication and preference-based decisions. They found AI tools useful for diagnosis and supporting discussion with patients through visual outputs. However, they identified limitations, including their own clinical knowledge gaps and the AI tools' lack of explainability. They suggested integrating SDM and AI-assisted diagnosis training into formal curricula to better prepare students for clinical practice.</p><p><strong>Next steps: </strong>Future efforts should focus on integrating this course into undergraduate curricula or transition training programs to provide experiential learning opportunities in AI-assisted clinical practice.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145985867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gregory M Ow, Geoffrey V Stetson, Joseph A Costello, Anthony R Artino, Lauren A Maggio
<p><strong>Problem: </strong>Medical education scholars struggle to join ongoing conversations in their field due to the lack of a dedicated medical education corpus. Without such a corpus, scholars must search too widely across thousands of irrelevant journals or too narrowly by relying on PubMed's Medical Subject Headings (MeSH). In tests conducted for this study, MeSH missed 34% of medical education articles.</p><p><strong>Approach: </strong>From January to December 2024, the authors developed the Medical Education Corpus (MEC), the first dedicated collection of medical education articles, through a 3-step process. First, using the core-periphery model, they created the Medical Education Journals (MEJ), a collection of 2 groups of journals based on participation and influence in medical education discourse: the MEJ-Core (formerly the MEJ-24, 24 journals) and the MEJ-Adjacent (127 journals). Second, they developed and evaluated a machine learning model, the MEC Classifier, trained on 4,032 manually labeled articles to identify medical education content. Third, they applied the MEC Classifier to extract medical education articles from the MEJ-Core and MEJ-Adjacent journals.</p><p><strong>Outcomes: </strong>As of December 2024, the MEC contained 119,137 medical education articles from the MEJ-Core (54,927 articles) and MEJ-Adjacent journals (64,210 articles). In an evaluation using 1,358 test articles, the MEC Classifier demonstrated significantly improved sensitivity compared with MeSH (90% vs 66%, P = .001), while maintaining a similar positive predictive value (82% vs 81%).</p><p><strong>Next steps: </strong>The MEC provides a focused corpus that enables medical education scholars to more easily join conversations in the field. Scholars can rely on the MEC when reviewing literature to frame their work, and the MEC also creates opportunities for field-wide analyses and meta-research. 
The core methodology also underlies the MedEdMentor Paper Database (mededmentor.org), a separately maintained online tool that complements the versioned MEC snapshot with a web-based search interface. Teaser text: Medical education scholars often struggle to effectively "join the conversation" because relevant literature is buried within biomedical databases like PubMed or general academic search engines like Google Scholar. This article introduces the Medical Education Corpus (MEC), a dedicated collection of 119,137 medical education articles curated using a specialized machine-learning classifier. In head-to-head testing, the MEC significantly outperformed PubMed's MeSH terms, capturing 90% of medical education articles compared with MeSH's 66%. By assembling these articles into a single, focused dataset, the MEC allows scholars to more easily find the literature they need to frame their work. The core methodology also underlies MedEdMentor, a separately maintained online tool that makes these optimized searches accessible to the wider medical education community.
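The classifier-versus-MeSH comparison above rests on two standard retrieval metrics. A minimal sketch of how they are computed, using hypothetical confusion-matrix counts chosen only to mirror the reported 90%/82% figures (not the MEC evaluation data):

```python
def sensitivity(tp, fn):
    """Share of true medical education articles the method retrieves: TP / (TP + FN)."""
    return tp / (tp + fn)

def ppv(tp, fp):
    """Share of retrieved articles that truly belong: TP / (TP + FP)."""
    return tp / (tp + fp)

# Hypothetical counts on a labeled test set (illustrative only).
tp, fp, fn = 450, 99, 50
print(f"sensitivity={sensitivity(tp, fn):.2f}, ppv={ppv(tp, fp):.2f}")
# → sensitivity=0.90, ppv=0.82
```

The trade-off the article highlights is exactly this pair: the MEC Classifier raises sensitivity (fewer missed articles) without sacrificing positive predictive value.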
{"title":"Joining the conversation: introducing a dedicated medical education corpus.","authors":"Gregory M Ow, Geoffrey V Stetson, Joseph A Costello, Anthony R Artino, Lauren A Maggio","doi":"10.1093/acamed/wvaf008","DOIUrl":"https://doi.org/10.1093/acamed/wvaf008","url":null,"abstract":"<p><strong>Problem: </strong>Medical education scholars struggle to join ongoing conversations in their field due to the lack of a dedicated medical education corpus. Without such a corpus, scholars must search too widely across thousands of irrelevant journals or too narrowly by relying on PubMed's Medical Subject Headings (MeSH). In tests conducted for this study, MeSH missed 34% of medical education articles.</p><p><strong>Approach: </strong>From January to December 2024, the authors developed the Medical Education Corpus (MEC), the first dedicated collection of medical education articles, through a 3-step process. First, using the core-periphery model, they created the Medical Education Journals (MEJ), a collection of 2 groups of journals based on participation and influence in medical education discourse: the MEJ-Core (formerly the MEJ-24, 24 journals) and the MEJ-Adjacent (127 journals). Second, they developed and evaluated a machine learning model, the MEC Classifier, trained on 4,032 manually labeled articles to identify medical education content. Third, they applied the MEC Classifier to extract medical education articles from the MEJ-Core and MEJ-Adjacent journals.</p><p><strong>Outcomes: </strong>As of December 2024, the MEC contained 119,137 medical education articles from the MEJ-Core (54,927 articles) and MEJ-Adjacent journals (64,210 articles). 
In an evaluation using 1,358 test articles, the MEC Classifier demonstrated significantly improved sensitivity compared with MeSH (90% vs 66%, P = .001), while maintaining a similar positive predictive value (82% vs 81%).</p><p><strong>Next steps: </strong>The MEC provides a focused corpus that enables medical education scholars to more easily join conversations in the field. Scholars can rely on the MEC when reviewing literature to frame their work, and the MEC also creates opportunities for field-wide analyses and meta-research. The core methodology also underlies the MedEdMentor Paper Database (mededmentor.org), a separately maintained online tool that complements the versioned MEC snapshot with a web-based search interface. Teaser text: Medical education scholars often struggle to effectively \"join the conversation\" because relevant literature is buried within biomedical databases like PubMed or general academic search engines like Google Scholar. This article introduces the Medical Education Corpus (MEC), a dedicated collection of 119,137 medical education articles curated using a specialized machine-learning classifier. In head-to-head testing, the MEC significantly outperformed PubMed's MeSH terms, capturing 90% of medical education articles compared with MeSH's 66%. By assembling these articles into a single, focused dataset, the MEC allows scholars to more easily find the literature they need to frame their work. 
The core methodology also underlies MedEdMentor, a separately maintained online tool that makes these optimized searches accessible to the wider medical education community.","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145960721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Shame is a deeply personal, complex, and underexplored emotion in medical training; however, how medical learners engage with shame (ie, how they process, recover from, and/or address shame) and the environmental factors affecting this process are currently unknown. This study used hermeneutic phenomenology to explore how medical learners (ie, resident physicians and medical students) engage with shame once it has occurred and the factors that influence this engagement.
Method: This study, which is part of a qualitative research program addressing shame in medical learners, used data collected from 12 residents (2016 and 2017) and 16 medical students (2018) from a residency program and a medical school in the United States. Data collection occurred via semistructured interviews during which participants reflected on shame experiences, including their engagement with it. The authors selected 14 transcripts (7 medical students, 7 residents) to achieve a range of shame experiences and impacts. Data were analyzed using Ajjawi and Higgs' 6 steps of hermeneutic analysis.
Results: Internal scaffolding (thought processes, self-evaluative tendencies, and position relative to others that informed participants' self-concept) was central to shame engagement. Learners' internal scaffoldings shaped and were shaped by distressing shame-integrating engagement (ie, hiding the self, deflecting shame, and transferring shame) and constructive shame-disintegrating engagement (ie, orienting toward others, exerting agency over self-evaluation, and reorienting to a core sense of self). Learning environments influenced shame engagement; environmental values that promoted shame-disintegrating engagement included learner centeredness, inclusivity, vulnerability, and respect.
Conclusions: Although struggle in medical training is inevitable, how learners respond to the shame that can follow is not. The divergent nature of shame engagement highlights the importance of learner agency and environmental response to shame. The authors provide specific suggestions for learners, faculty, and leaders to advance constructive shame engagement and the growth, connection, and belonging it can inspire.
{"title":"Seeking stabilization: how medical learners engage with shame during training.","authors":"Anna V Kulawiec, Luna Dolezal, William E Bynum","doi":"10.1093/acamed/wvaf029","DOIUrl":"https://doi.org/10.1093/acamed/wvaf029","url":null,"abstract":"<p><strong>Purpose: </strong>Shame is a deeply personal, complex, and underexplored emotion in medical training; however, how medical learners engage with shame (ie, how they process, recover from, and/or address shame) and the environmental factors affecting this process are currently unknown. This study used hermeneutic phenomenology to explore how medical learners (ie, resident physicians and medical students) engage with shame once it has occurred and the factors that influence this engagement.</p><p><strong>Method: </strong>This study, which is part of a qualitative research program addressing shame in medical learners, used data collected from 12 residents (2016 and 2017) and 16 medical students (2018) from a residency program and a medical school in the United States. Data collection occurred via semistructured interviews during which participants reflected on shame experiences, including their engagement with it. The authors selected 14 transcripts (7 medical students, 7 residents) to achieve a range of shame experiences and impacts. Data were analyzed using Ajjawi and Higgs' 6 steps of hermeneutic analysis.</p><p><strong>Results: </strong>Internal scaffolding (thought processes, self-evaluative tendencies, and position relative to others that informed participants' self-concept) was central to shame engagement. Learners' internal scaffoldings shaped and were shaped by distressing shame-integrating engagement (ie, hiding the self, deflecting shame, and transferring shame) and constructive shame-disintegrating engagement (ie, orienting toward others, exerting agency over self-evaluation, and reorienting to a core sense of self). 
Learning environments influenced shame engagement; environmental values that promoted shame-disintegrating engagement included learner centeredness, inclusivity, vulnerability, and respect.</p><p><strong>Conclusions: </strong>Although struggle in medical training is inevitable, how learners respond to the shame that can follow is not. The divergent nature of shame engagement highlights the importance of learner agency and environmental response to shame. The authors provide specific suggestions for learners, faculty, and leaders to advance constructive shame engagement and the growth, connection, and belonging it can inspire.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel J Schumacher, Daniel J Sklansky, Brian Rissmiller, Lynn Thoreson, Linda A Waggoner-Fountain, Rajat Pareek, Sue E Poynter, Ariel S Winn, Catherine Michelson, Benjamin Kinnear, David A Turner, Leah S Millstein, Jennifer R Di Rocco, Kelsie Avants, Joanna Lewis, Pavan Srivastava, Erin L Giudice, Michelle Arandes, Sylvia Yeh, Alan Schwartz
Purpose: Entrustable professional activities (EPAs) detail essential activities within a given specialty. Although 17 general pediatrics EPAs have been defined, it is not known how many are needed to make high-reliability overall entrustment decisions about resident readiness for practice at the time of graduation and initial certification. This study sought to determine how many general pediatrics EPAs are needed.
Method: During the 2021 to 2022, 2022 to 2023, and 2023 to 2024 academic years, the authors collected entrustment-supervision levels, determined by clinical competency committees biannually, for the 17 general pediatrics EPAs for residents at 48 U.S. pediatric residency training programs. Midyear reports were collected between November and January of each year, and end-of-year reports were collected between May and July. The authors conducted generalizability and decision studies to determine the number of EPAs needed to make a reliable overall entrustment decision.
Results: A total of 166,077 individual entrustment-supervision levels were collected for 4,250 pediatric residents across the 17 general pediatrics EPAs. Across all data reporting cycles, the authors found that assessing 6 EPAs yields a generalizability coefficient of 0.8 and assessing 12 EPAs yields a generalizability coefficient of 0.9. However, results differed for midyear compared with end-of-year data collection timepoints as well as by postgraduate year. At graduation, 9 to 13 EPAs are needed to make a highly reliable (generalizability coefficient of 0.9) overall decision about degree of entrustment for unsupervised practice.
Conclusions: This study provides rich insight into the number of EPAs needed to make reliable entrustment decisions about resident readiness to provide patient care. Although readiness can be determined with as few as 9 general pediatrics EPAs (an assessment task), more may be needed to inform a comprehensive curriculum that ensures focus in all areas important to developing general pediatricians during residency training (a curricular task). Teaser text: This study sought to determine how many entrustable professional activities are necessary to make high-reliability overall entrustment decisions about pediatric resident readiness for unsupervised practice.
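A decision study projects how the generalizability coefficient grows as more EPAs are assessed, in Spearman-Brown fashion: G(n) = σ²p / (σ²p + σ²e/n). The sketch below uses a hypothetical error-to-person variance ratio chosen so that 6 EPAs give G = 0.8, loosely echoing the reported coefficients; the study's actual variance components are not given here:

```python
def projected_g(n_epas, err_to_person_ratio):
    """D-study projection of the generalizability coefficient:
    G(n) = sigma2_p / (sigma2_p + sigma2_e / n) = 1 / (1 + r / n),
    where r = sigma2_e / sigma2_p."""
    return 1.0 / (1.0 + err_to_person_ratio / n_epas)

# Hypothetical ratio r, picked so G(6) = 0.8 (illustrative, not study data).
r = 1.5
for n in (3, 6, 12, 17):
    print(n, round(projected_g(n, r), 3))
```

Under this toy ratio, 6 EPAs project to G = 0.8 and 12 EPAs to roughly 0.89, illustrating the diminishing returns behind the study's 9-to-13-EPA recommendation.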
{"title":"Use of entrustable professional activities for reliable overall entrustment decisions.","authors":"Daniel J Schumacher, Daniel J Sklansky, Brian Rissmiller, Lynn Thoreson, Linda A Waggoner-Fountain, Rajat Pareek, Sue E Poynter, Ariel S Winn, Catherine Michelson, Benjamin Kinnear, David A Turner, Leah S Millstein, Jennifer R Di Rocco, Kelsie Avants, Joanna Lewis, Pavan Srivastava, Erin L Giudice, Michelle Arandes, Sylvia Yeh, Alan Schwartz","doi":"10.1093/acamed/wvaf001","DOIUrl":"https://doi.org/10.1093/acamed/wvaf001","url":null,"abstract":"<p><strong>Purpose: </strong>Entrustable professional activities (EPAs) detail essential activities within a given specialty. Although 17 general pediatrics EPAs have been defined, it is not known how many are needed to make high-reliability overall entrustment decisions about resident readiness for practice at the time of graduation and initial certification. This study sought to determine how many general pediatrics EPAs are needed.</p><p><strong>Method: </strong>During the 2021 to 2022, 2022 to 2023, and 2023 to 2024 academic years, the authors collected entrustment-supervision levels, determined by clinical competency committees biannually, for the 17 general pediatrics EPAs for residents at 48 U.S. pediatric residency training programs. Midyear reports were collected between November and January of each year, and end-of-year reports were collected between May and July. 
The authors conducted generalizability and decision studies to determine the number of EPAs needed to make a reliable overall entrustment decision.</p><p><strong>Results: </strong>A total of 166,077 individual entrustment-supervision levels were collected for 4,250 pediatric residents across the 17 general pediatrics EPAs. Across all data reporting cycles, the authors found that assessing 6 EPAs yields a generalizability coefficient of 0.8 and assessing 12 EPAs yields a generalizability coefficient of 0.9. However, results differed for midyear compared with end-of-year data collection timepoints as well as by postgraduate year. At graduation, 9 to 13 EPAs are needed to make a highly reliable (generalizability coefficient of 0.9) overall decision about degree of entrustment for unsupervised practice.</p><p><strong>Conclusions: </strong>This study provides rich insight into the number of EPAs needed to make reliable entrustment decisions about resident readiness to provide patient care. Although readiness can be determined with as few as 9 general pediatrics EPAs (an assessment task), more may be needed to inform a comprehensive curriculum that ensures focus in all areas important to developing general pediatricians during residency training (a curricular task). Teaser text: This study sought to determine how many entrustable professional activities are necessary to make high-reliability overall entrustment decisions about pediatric resident readiness for unsupervised practice.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Psychological safety and stress in the learning environment.","authors":"Alaina B Mui, Timothy D Bradley, Erika S Abel","doi":"10.1093/acamed/wvaf040","DOIUrl":"https://doi.org/10.1093/acamed/wvaf040","url":null,"abstract":"","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amy K Ribera, Bruck Mulat, Kevin Wenger, Gabriel T Bosslet, Laura Torbeck, Stephen P Bogdewic, Emily C Walvoord, Mary Dankoski, Megan M Palmer
Purpose: In 2003, the Indiana University School of Medicine launched the Leadership in Academic Medicine Program (LAMP), a cohort-based leadership program for early-career faculty. This study evaluated the program's effectiveness in faculty retention, promotion, and attainment of leadership roles.
Method: The authors identified early-career faculty hired at the assistant professor rank between 2000 and 2017 who were eligible for LAMP (2003-2017). Propensity score matching was used to create comparable groups of LAMP participants and matched controls (nonparticipants) based on track, department type, age at hire, hire year, sex, and underrepresented in medicine status. Survival analyses, including Kaplan-Meier curves and Cox proportional hazards models, were conducted to compare time to promotion and faculty departure between the matched groups. Logistic regression was used to evaluate differences between groups and their likelihood of achieving leadership roles.
Results: Among the 1,329 early-career faculty included in the study, 443 LAMP participants demonstrated higher rates of promotion (hazard ratio [HR], 1.57; 95% CI, 1.31-1.88; P < .001) and significantly lower rates of departure (HR, 0.72; 95% CI, 0.60-0.85; P < .001) compared with 886 matched controls. Additionally, LAMP participants were significantly more likely to attain a leadership role (odds ratio [OR], 2.02; 95% CI, 1.49-2.73; P < .001). The positive impact of LAMP was especially pronounced among nontenure-track faculty, with notable improvements observed in promotion, retention, and leadership attainment. Although both male and female faculty benefited from participation, women had particularly increased odds of securing leadership positions (OR, 2.49; 95% CI, 1.59-3.87; P < .001).
Conclusions: Participation in LAMP significantly improved promotion rates, retention, and leadership attainment, particularly for nontenure-track and female faculty. These findings demonstrate the program's effectiveness in supporting faculty development and career advancement.
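The time-to-promotion and time-to-departure comparisons above rest on Kaplan-Meier estimation, which handles faculty still in post (censored observations). A minimal pure-Python sketch of the estimator, on hypothetical time-to-promotion data rather than the LAMP cohort:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates S(t) at each distinct event time.
    times: follow-up in years; events: 1 = event occurred, 0 = censored."""
    s, curve = 1.0, []
    for t in sorted(set(times)):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        n = sum(1 for ti in times if ti >= t)  # still at risk just before t
        if d:
            s *= 1 - d / n
            curve.append((t, round(s, 3)))
    return curve

# Hypothetical years to promotion for 6 faculty; 0 marks those not yet promoted.
times  = [3, 5, 5, 7, 8, 9]
events = [1, 1, 0, 1, 0, 1]
print(kaplan_meier(times, events))
# → [(3, 0.833), (5, 0.667), (7, 0.444), (9, 0.0)]
```

Cox proportional hazards models, as used in the study, then compare such curves between matched groups while adjusting for covariates, yielding the reported hazard ratios.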
{"title":"Measuring the impact of a leadership in academic medicine program on faculty advancement and leadership attainment.","authors":"Amy K Ribera, Bruck Mulat, Kevin Wenger, Gabriel T Bosslet, Laura Torbeck, Stephen P Bogdewic, Emily C Walvoord, Mary Dankoski, Megan M Palmer","doi":"10.1093/acamed/wvaf017","DOIUrl":"https://doi.org/10.1093/acamed/wvaf017","url":null,"abstract":"<p><strong>Purpose: </strong>In 2003, the Indiana University School of Medicine launched the Leadership in Academic Medicine Program (LAMP), a cohort-based leadership program for early-career faculty. This study evaluated the program's effectiveness in faculty retention, promotion, and attainment of leadership roles.</p><p><strong>Method: </strong>The authors identified early-career faculty hired at the assistant professor rank between 2000 and 2017 who were eligible for LAMP (2003-2017). Propensity score matching was used to create comparable groups of LAMP participants and matched controls (nonparticipants) based on track, department type, age at hire, hire year, sex, and underrepresented in medicine status. Survival analyses, including Kaplan-Meier curves and Cox proportional hazards models, were conducted to compare time to promotion and faculty departure between the matched groups. Logistic regression was used to evaluate differences between groups and their likelihood of achieving leadership roles.</p><p><strong>Results: </strong>Among the 1,329 early-career faculty included in the study, 443 LAMP participants demonstrated higher rates of promotion (hazard ratio [HR], 1.57; 95% CI, 1.31-1.88; P < .001) and significantly lower rates of departure (HR, 0.72; 95% CI, 0.60-0.85; P < .001) compared with 886 matched controls. Additionally, LAMP participants were significantly more likely to attain a leadership role (odds ratio [OR], 2.02; 95% CI, 1.49-2.73; P < .001). 
The positive impact of LAMP was especially pronounced among nontenure-track faculty, with notable improvements observed in promotion, retention, and leadership attainment. Although both male and female faculty benefited from participation, women had particularly increased odds of securing leadership positions (OR, 2.49; 95% CI, 1.59-3.87; P < .001).</p><p><strong>Conclusions: </strong>Participation in LAMP significantly improved promotion rates, retention, and leadership attainment, particularly for nontenure-track and female faculty. These findings demonstrate the program's effectiveness in supporting faculty development and career advancement.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Donya Derakshani, Luísa de Lima Barbosa Colapietro, Nadia Islam, Michelle Louie
<p><strong>Purpose: </strong>This study synthesizes and characterizes pregnancy outcomes and complications among US female physician trainees.</p><p><strong>Method: </strong>In this systematic review, the PubMed, Embase, Scopus, Web of Science, and Cochrane databases were searched in May 2024 and May 2025 for peer-reviewed, English-language articles published from January 2000 to May 2025 reporting pregnancy outcomes in US female residents or fellows. Medical Subject Headings and keywords for pregnancy complications, pregnancy outcomes, and women or female physicians were used. Cross-sectional, retrospective, prospective, and case-control studies were included. Opinion articles, editorials, systematic or scoping reviews, and other non-peer-reviewed articles were excluded. Abstracts and full-text articles were screened by 3 investigators. Extracted data were documented in standardized forms. Risk of bias was assessed regarding study population and outcome reporting and summarized using standardized quality assessment tools.</p><p><strong>Results: </strong>Of 792 titles and abstracts screened, 10 articles involving 4,891 trainees were included. Eight of these 10 studies described pregnancy outcomes specifically in surgical trainees; all were cross-sectional surveys. The numbers of trainees experiencing pregnancy complications ranged from 38.9% to 56.0%. The most frequently described complication was preterm labor (9 of 10 studies), affecting 5.3% to 22.9% of surveyed trainees. The most cited fetal complication was miscarriage (5 of 10 studies), affecting 20.0% to 28.0% of trainee pregnancies. Other commonly reported complications included preeclampsia or eclampsia, hyperemesis gravidarum, and gestational hypertension.</p><p><strong>Conclusions: </strong>Although pregnancy complications appear to be common among female physician trainees, current data on obstetric outcomes in this population remain limited. 
Most studies focus specifically on surgical trainees and rely on self-reported data, limiting generalizability. Residency programs may consider findings from this review to enhance support for childbearing trainees, although prospective studies are needed to better discern the impact of medical training on maternal and fetal outcomes.</p><p><strong>Teaser text: </strong>Pregnancy often coincides with the demanding years of medical training, yet robust data on pregnancy outcomes among physician trainees remain limited. This systematic review synthesizes and characterizes pregnancy outcomes and complications among US female physician trainees. Ten unique studies involving nearly 5000 trainees were included, and the findings raise concern for a high frequency of pregnancy complications in this population. Reported overall complication rates ranged from 38.9% to 56.0%, with preterm labor and miscarriage among the most frequently reported outcomes. Interpretation of these findings is limited by confounding factors, study heterogeneity, and reliance on self-reported data, highlighting notable gaps in the existing literature. Nonetheless, these findings underscore the ongoing challenges faced by childbearing trainees and the need for prospective, standardized research to better inform residency program policies and support healthier pregnancies during medical training.</p>
{"title":"Pregnancy outcomes in US female physician trainees: a systematic review.","authors":"Donya Derakshani, Luísa de Lima Barbosa Colapietro, Nadia Islam, Michelle Louie","doi":"10.1093/acamed/wvaf030","DOIUrl":"https://doi.org/10.1093/acamed/wvaf030","url":null,"abstract":"<p><strong>Purpose: </strong>This study synthesizes and characterizes pregnancy outcomes and complications among US female physician trainees.</p><p><strong>Method: </strong>In this systematic review, the PubMed, Embase, Scopus, Web of Science, and Cochrane databases were searched in May 2024 and May 2025 for peer-reviewed, English-language articles published from January 2000 to May 2025 reporting pregnancy outcomes in US female residents or fellows. Medical Subject Headings and keywords for pregnancy complications, pregnancy outcomes, and women or female physicians were used. Cross-sectional, retrospective, prospective, and case-control studies were included. Opinion articles, editorials, systematic or scoping reviews, and other non-peer-reviewed articles were excluded. Abstracts and full-text articles were screened by 3 investigators. Extracted data were documented in standardized forms. Risk of bias was assessed regarding study population and outcome reporting and summarized using standardized quality assessment tools.</p><p><strong>Results: </strong>Of 792 titles and abstracts screened, 10 articles involving 4,891 trainees were included. Eight of these 10 studies described pregnancy outcomes specifically in surgical trainees; all were cross-sectional surveys. The numbers of trainees experiencing pregnancy complications ranged from 38.9% to 56.0%. The most frequently described complication was preterm labor (9 of 10 studies), affecting 5.3% to 22.9% of surveyed trainees. The most cited fetal complication was miscarriage (5 of 10 studies), affecting 20.0% to 28.0% of trainee pregnancies. 
Other commonly reported complications included preeclampsia or eclampsia, hyperemesis gravidarum, and gestational hypertension.</p><p><strong>Conclusions: </strong>Although pregnancy complications appear to be common among female physician trainees, current data on obstetric outcomes in this population remain limited. Most studies focus specifically on surgical trainees and rely on self-reported data, limiting generalizability. Residency programs may consider findings from this review to enhance support for childbearing trainees, although prospective studies are needed to better discern the impact of medical training on maternal and fetal outcomes.</p><p><strong>Teaser text: </strong>Pregnancy often coincides with the demanding years of medical training, yet robust data on pregnancy outcomes among physician trainees remain limited. This systematic review synthesizes and characterizes pregnancy outcomes and complications among US female physician trainees. Ten unique studies involving nearly 5000 trainees were included and findings raise concern for a high frequency of pregnancy complications in this population. Reported overall complication rates ranged from 38.9% to 56.0%, with preterm labor and miscarriage among the most frequently reported outcomes. Interpretation of these findings is limited by confounding factors, study heterogeneity, and reliance on self-reported data, highlighting notable gaps in the existing literature. Nonetheless, these findings underscore the ongoing challenges faced by childbearing trainees and the need for prospective, standardized research to better inform residency program policies and support healthier pregnancies during medical training.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146114875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laurah Turner, Christine Yang Zhou, Danielle Weber, Seth Overla, Mathew Kelleher, Pamela Baker, Sally A Santen
Problem: Medical educators pay increasing attention to the potential utility of narrative data for assessment, but the lack of efficient, standardized ways of interpreting these data has limited their use. Natural language processing (NLP) algorithms could provide new insights for using narrative data within assessment, especially for identifying at-risk or low-performing students.
Approach: Assessment data were reviewed from 16 cohorts of medical students from the University of Cincinnati College of Medicine (graduating classes of 2006-2022). A T-score Average (TSA) was calculated for each student based on clerkship assessment data. Narrative data from core clerkship evaluations, drawn from responses to the prompt "any opportunities for improvement," were used for analysis. The narrative data and calculated TSA were then used to train and test 4 NLP models, with the goal of using NLP to identify at-risk students, defined as those in the bottom 10% of average TSA scores.
Outcomes: Based on typical NLP methods, the developed NLP models all performed adequately in identifying student performance, with an overall accuracy of 0.8 across all 4 models. However, none of the NLP models was able to identify students within the bottom 10% of performance. During this process, we uncovered "copy/paste" behavior, a previously undocumented phenomenon within narrative data in which preceptors duplicated comments from 1 student to the next. Training the NLP models on data that included "copy/paste" comments improved their ability to identify students within the bottom and top 10% of performance.
Next steps: NLP models were unable to accurately identify at-risk students. Model accuracy increased with the inclusion of "copy/paste" comments, indicating there may be some discriminatory functionality within aspects of narrative data beyond keywords, such as lexical diversity and word quantity. Future work will explore how these other aspects of narrative data associate with performance and will utilize more advanced large language models for analysis. Based on typical natural language processing (NLP) methods, the tested NLP models performed adequately in identifying student performance with an overall accuracy of 0.8 across all models. However, none of the NLP models was able to identify students within the bottom 10% of performance, revealing widespread "copy/paste" behavior in evaluations that may limit narrative data's discriminatory utility.
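The approach above, training models on narrative comments to flag at-risk students and screening for duplicated "copy/paste" text, can be sketched as a minimal pipeline. This is a hedged illustration: the comments and labels below are invented, and TF-IDF with logistic regression is one typical NLP baseline, not necessarily the 4 models the authors trained.

```python
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy clerkship comments (invented); the real inputs were responses to the
# prompt "any opportunities for improvement" from core clerkship evaluations.
comments = [
    "needs to read more on differential diagnosis",
    "excellent presentations, keep refining exam skills",
    "needs to read more on differential diagnosis",   # verbatim copy/paste
    "work on time management during rounds",
    "strong fund of knowledge, polish note writing",
    "arrive earlier and prepare patient lists",
]
# 1 = bottom decile by T-score Average (simulated labels)
at_risk = [1, 0, 1, 1, 0, 1]

# Flag verbatim "copy/paste" comments before modeling.
copy_paste = {c for c, count in Counter(comments).items() if count > 1}

# Bag-of-words baseline: TF-IDF features into a logistic classifier.
features = TfidfVectorizer().fit_transform(comments)
model = LogisticRegression().fit(features, at_risk)
predictions = model.predict(features)
```

A real pipeline would evaluate on held-out cohorts and could add features such as lexical diversity or comment length, the signals the Next steps above suggest may carry discriminatory information beyond keywords.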
{"title":"Challenges in using natural language processing to stratify students by narrative assessments in undergraduate medical education.","authors":"Laurah Turner, Christine Yang Zhou, Danielle Weber, Seth Overla, Mathew Kelleher, Pamela Baker, Sally A Santen","doi":"10.1093/acamed/wvaf065","DOIUrl":"https://doi.org/10.1093/acamed/wvaf065","url":null,"abstract":"<p><strong>Problem: </strong>Medical educators pay increasing attention to the potential utility of narrative data for assessment, but lack of efficient and standardized ways of interpreting the data have limited its use. Natural language processing (NLP) algorithms could provide new insights for using narrative data within assessment especially identifying at-risk or low-performing students.</p><p><strong>Approach: </strong>Assessment data were reviewed from 16 cohorts of medical students from the University of Cincinnati College of Medicine (graduating classes of 2006-2022). A T-score Average (TSA) was calculated for each student based on clerkship assessment data. Narrative data from core clerkship evaluations in responses to the prompt \"any opportunities for improvement\" were utilized for analysis. The narrative data and calculated TSA were then used to train and test 4 NLP models with the goal of utilizing NLP to identify at-risk students as defined by the bottom 10% average TSA score.</p><p><strong>Outcomes: </strong>Based on typical NLP methods, the developed NLP models all performed adequately in identifying student performance with an overall accuracy of 0.8 across all 4 models. However, none of the NLP models was able to identify students within the bottom 10% of performance. During this process, we uncovered the presence of \"copy/paste\" behavior, a previously undocumented phenomenon within narrative data where preceptors duplicated comments from 1 student to the next. 
Training NLP models including \"copy/paste\" comments improved NLP ability to identify students within the bottom and top 10% of performance.</p><p><strong>Next steps: </strong>NLP models were unable to accurately identify at-risk students. Model accuracy increased with the inclusion of \"copy/paste\" comments, indicating there may be some discriminatory functionality within aspects of narrative data beyond keywords such as lexical diversity and word quantity. Future work will explore how these other aspects of narrative data associate with performance and utilize more advanced large language models for analysis. Based on typical natural language processing (NLP) methods, the tested NLP models performed adequately in identifying student performance with an overall accuracy of 0.8 across all models. However, none of the NLP models was able to identify students within the bottom 10% of performance, revealing widespread \"copy/paste\" behavior in evaluations that may limit narrative data's discriminatory utility.</p>","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146214924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander A Iyer, Karen E Hauer, Richard M Schwartzstein
{"title":"Reply to Smith and Piemonte.","authors":"Alexander A Iyer, Karen E Hauer, Richard M Schwartzstein","doi":"10.1093/acamed/wvaf018","DOIUrl":"https://doi.org/10.1093/acamed/wvaf018","url":null,"abstract":"","PeriodicalId":50929,"journal":{"name":"Academic Medicine","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145953720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}