Pub Date: 2025-06-09 DOI: 10.1016/j.mcpdig.2025.100237
Bincy Baby PharmD, MSc , Jasdeep Kaur Gill PharmD , Sadaf Faisal BPharm, PhD , Ghada Elba PharmD, MSc , SooMin Park PharmD (c) , Annette McKinnon , Kirk Patterson BA , Sara J.T. Guilcher PT, PhD , Feng Chang PharmD , Linda Lee MD , Catherine Burns PhD , Ryan Griffin PhD , Tejal Patel BScPharm, PharmD
Objective
To develop a comprehensive classification system for medication adherence technologies based on an inventory of characteristics and features of existing technology.
Participants and Methods
Using a 3-stage methodology (development, validation, and evaluation), the study adopted the taxonomy development method and was conducted from February 1, 2023, to July 31, 2024. In the development stage, medication adherence technologies were defined, end users were identified, and a meta-characteristic was determined; dimensions and characteristics were then identified using both empirical-to-conceptual and conceptual-to-empirical approaches. The taxonomy was validated through a Delphi consensus approach and by classifying 20 sample medication adherence technologies, and it was evaluated by mapping to codes identified from a qualitative study.
Results
After undergoing 8 iterations, which included incorporating feedback from a Delphi consensus survey, the final taxonomy comprised 7 dimensions, 25 subdimensions, and 320 characteristics. These key dimensions include Physical Features, Display, Connectivity, System Alert, Data Collection and Management, Operations, and Integration. The taxonomy was considered complete and valuable once all preestablished ending conditions were met, and its applicability and comprehensiveness were verified by comparing various medication adherence technologies and mapping to codes identified from a qualitative study.
Conclusion
This study establishes the first comprehensive feature-based classification system for medication adherence technologies, addressing a critical gap in the literature. The taxonomy provides a structured framework for categorizing and evaluating technologies, supporting usability testing and the selection of appropriate devices tailored to the unique needs of older adults.
Title: Medication Adherence Technologies: A Classification Taxonomy Based on Features
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100237

Pub Date: 2025-06-09 DOI: 10.1016/j.mcpdig.2025.100241
Arya S. Rao BA , Siona Prasad BA , Richard S. Lee BS , Susan Farrell MD , Sophia McKinley MD, MED , Marc D. Succi MD
Objective
To develop and validate an artificial intelligence–powered platform that simulates surgical oral examinations, addressing the limitations of traditional faculty-led sessions.
Patients and Methods
This cross-sectional study, conducted from June 1, 2024, through December 1, 2024, comprised technical validation and educational assessment of a novel large language model (LLM)–based surgical education tool (surgery oral examination large language model [SOE-LLM]). The study involved 12 surgical clerkship students completing their core rotation at a major academic medical center. The SOE-LLM, using MIMIC-IV–derived surgical cases (acute appendicitis and pancreatitis), was implemented to simulate oral examinations. Technical validation assessed performance across 8 domains: case presentation accuracy, physical examination findings, historical detail preservation, laboratory data reporting, imaging interpretation, management decisions, and recognition of contraindicated interventions. Educational utility was evaluated using a 5-point Likert scale.
Results
Technical validation showed the SOE-LLM’s ability to function as a consistent oral examiner. The model accurately guided students through case presentations, responded to diagnostic questions, and provided clinically sound responses based on MIMIC-IV cases. When tested with standardized prompts, it maintained examination fidelity, requiring proper diagnostic reasoning and differentiating operative versus medical management. Student evaluations highlighted the platform’s value as an examination preparation tool (mean, 4.250; SEM, 0.1794) and its ability to create a low-stakes environment for high-stakes decision practice (mean, 4.833; SEM, 0.1124).
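The reported means and standard errors can be related to individual ratings with a short calculation. The following is a minimal sketch, assuming the SEM is the sample standard deviation divided by the square root of n (n = 12 students). The two rating lists are hypothetical: they were chosen only because they reproduce the reported summary statistics, and the abstract does not give the actual per-student responses.

```python
import math
import statistics


def mean_and_sem(scores):
    """Return (mean, standard error of the mean) for a list of Likert ratings."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(scores) / math.sqrt(n)
    return mean, sem


# Hypothetical ratings from 12 students on a 5-point scale (not study data).
exam_prep = [5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 3]
low_stakes = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4]

for label, scores in [("exam preparation", exam_prep),
                      ("low-stakes practice", low_stakes)]:
    mean, sem = mean_and_sem(scores)
    print(f"{label}: mean={mean:.3f}, SEM={sem:.4f}")
```

With only 12 respondents on a 5-point scale, a single changed rating moves both statistics noticeably, which is worth keeping in mind when reading pilot results of this size.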
Conclusion
The SOE-LLM shows potential as a valuable tool for surgical education, offering a consistent and accessible platform for simulating oral examinations.
Title: Development and Evaluation of an Artificial Intelligence–Powered Surgical Oral Examination Simulator: A Pilot Study
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100241

Pub Date: 2025-06-09 DOI: 10.1016/j.mcpdig.2025.100240
Stephanie Zawada PhD, MS , Jestrii Acosta MS , Caden Collins BA , Oana Dumitrascu MD, MS , Ehab Harahsheh MBBS , Clinton Hagen MS , Ali Ganjizadeh MD , Elham Mahmoudi MD , Bradley Erickson MD, PhD , Bart Demaerschalk MD, MSc
Objective
To assess the feasibility of using smartphones to longitudinally collect objective behavior measures and establish the extent to which they can predict gold-standard depression severity in patients with ischemic stroke and transient ischemic attack (IS/TIA) symptoms.
Patients and Methods
Participants with IS/TIA symptoms were monitored in real-world settings using the Beiwe application for 8 or more weeks between March 1, 2024, and November 15, 2024. Depression symptoms were tracked via weekly Patient Health Questionnaire (PHQ)-8 surveys, monthly personnel-administered Montgomery–Åsberg Depression Rating Scale (MADRS) assessments, and weekly averages of smartphone sensor measures. Repeated measures correlation established associations between PHQ-8 scores and objective behavior measures. To investigate how closely smartphone data predicted MADRS scores, linear mixed models were used.
Results
Among enrolled participants (n=54), 35 completed the study (64.8%). PHQ-8 scores were associated with distance from home (r=0.173), time spent at home (r=−0.147) and PHQ-8 administration duration (r=0.151). Using demographic data and the most recent PHQ-8 scores, average root-mean-squared error for depression severity prediction across models was 1.64 with only PHQ-8 scores, 1.49 also including accelerometer and GPS data, and 1.36 also including PHQ-8 administration duration.
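Root-mean-squared error, the metric reported above, is the square root of the mean squared difference between observed and predicted scores. A minimal sketch follows. The MADRS values are hypothetical, and the study's linear mixed models are not reproduced here.

```python
import math


def rmse(observed, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and model-predicted MADRS scores (not study data).
observed_madrs = [10, 14, 7, 20]
predicted_madrs = [11, 12, 8, 22]
print(round(rmse(observed_madrs, predicted_madrs), 2))  # prints 1.58
```

A lower value means predictions sit closer to the observed scores, which is how the reported drop from 1.64 to 1.36 should be read.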
Conclusion
Smartphone sensors captured objective behavior measures in patients with IS/TIA. In predictive models, the accuracy of depression severity scores improved as measures from additional smartphone sensors were included. Future research should validate this decentralized, exploratory approach in a larger cohort. Our work is a step toward showing that real-world monitoring with active and passive data may triage patients with IS/TIA for efficient depression screening and provide digital mobility and response time endpoints.
Title: Real-World Smartphone Data Predicts Mood After Ischemic Stroke and Transient Ischemic Attack Symptoms and May Constitute Digital Endpoints: A Proof-of-Concept Study
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100240

Pub Date: 2025-06-07 DOI: 10.1016/j.mcpdig.2025.100238
Carmen Simone Grilo Diniz PhD , Ana Carolina Arruda Franzon PhD , Beatriz Fioretti-Foschi PhD , Livia Sanches Pedrilio MSc , Edson Amaro Jr. PhD , João Ricardo Sato PhD , Denise Yoshie Niy PhD
Objective
To design and evaluate an information and communication intervention via a smartphone application that provides access to essential information on best practices and safety in maternity services.
Participants and Methods
This randomized controlled trial, which used a mobile application to recruit participants and deliver the intervention, was conducted from October 31, 2020, through December 12, 2020. The study was offered to all users registered on the application who self-identified as women, were aged between 18 and 49 years, and had at least 1 child, were pregnant, or were interested in having children in the future. The primary outcome measured was increased participant engagement in seeking an active role and informed choices. Participants received information about best practices (intervention) or about diapers (control). The trial was registered on the Brazilian Clinical Trials Registry Platform, and the protocol was published according to CONSORT e-Health guidelines. Effect size was estimated by odds ratio, with CI and P values.
Results
In total, 20,608 users were invited to participate in the study; of 17,643 enrolled (85.6% of invited users), 13,969 (79.1% of enrolled participants) women completed the intervention stage and were included in the analyses; 7121 (50.9% of all women included) had up to high school level; and 5855 (41.9%) used both public and private services. The intervention group registered an increased engagement in seeking an active role or making informed choices (odds ratio, 2.06; P<.001). The intervention proved to be highly effective for all secondary outcomes, as well.
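An odds ratio such as the one reported above compares the odds of an outcome between two groups. The sketch below computes an odds ratio and a Wald 95% CI from a 2x2 table. The counts are hypothetical, since the abstract does not report group-level counts, and the Wald interval is an assumed method, not necessarily the one the authors used.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: intervention, outcome present   b: intervention, outcome absent
    c: control, outcome present        d: control, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the trial).
or_, lo, hi = odds_ratio_wald_ci(a=200, b=100, c=100, d=200)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An odds ratio above 1 with an interval that stays above 1 indicates higher odds of the outcome in the intervention group than in the control group.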
Conclusion
This affordable digital technology effectively promoted awareness of safer, empowered choices in childbirth care, facilitating the translation of evidence-based, rights-based knowledge from institutional guidelines and recommendations to a broader audience.
Trial Registration
Brazilian Registry of Clinical Trials Identifier: RBR-3g5f9f; WHO’s Unique Trial Identifier: UTN U1111-1255-8683.
Title: Digital Technology for Informed Choices at Childbirth in Brazil: A Randomized Controlled Trial
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100238

Pub Date: 2025-06-03 DOI: 10.1016/j.mcpdig.2025.100232
Kaylin T. Nguyen MD , Jingzhi Yu BA , Haley Hedlin PhD , Adam T. Phillips MD , Sumbul Desai MD , Lauren Cheung MD , Peter R. Kowey MD , Sneha S. Jain MD , John S. Rumsfeld MD, PhD , Andrea M. Russo MD , Christopher B. Granger MD , Mellanie True Hills BS , Manisha Desai PhD , Kenneth W. Mahaffey MD , Mintu P. Turakhia MD, MAS , Marco V. Perez MD
Objective
To evaluate differences in study engagement in diverse racial/ethnic groups that have been significantly underrepresented in atrial fibrillation and digital clinical trials.
Patients and Methods
This was a secondary analysis of participants from the Apple Heart Study, a prospective, siteless, single-arm pragmatic clinical trial from November 29, 2017, to January 31, 2019. Black, Hispanic, Asian, and White participants were monitored using an irregular rhythm notification algorithm designed to detect atrial fibrillation on a smartwatch. Logistic regression was performed to evaluate the relationship between race/ethnicity and completion of the first study visit after an irregular rhythm notification, adjusting for demographic characteristics and comorbidities.
Results
Of the 419,297 participants, 393,396 (93.8%) individuals self-identified as White, Black, Hispanic, or Asian. Overall, participants were 57% men and had a mean (SD) age of 41 (13) years. Among 2044 (0.52%) participants who received an irregular rhythm notification, non-White participants had lower odds of completing the initial virtual study visit compared with White participants (Black: OR, 0.61; 95% CI, 0.39-0.94; Hispanic: OR, 0.62; 95% CI, 0.40-0.95; Asian: OR, 0.40; 95% CI, 0.23-0.66) after multivariate adjustment. Among those who completed the initial study visit, there was no statistically significant difference in the odds of returning the electrocardiogram patch in the non-White groups compared with that of the White group.
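One common way to read the intervals above: an odds ratio is conventionally treated as statistically significant at the 5% level when its 95% CI excludes 1. The sketch below applies that check to the three adjusted odds ratios reported in this abstract. The data structure and function name are illustrative only.

```python
# Adjusted odds ratios (vs. White participants) as reported in the abstract:
# group -> (odds ratio, CI lower bound, CI upper bound)
reported = {
    "Black": (0.61, 0.39, 0.94),
    "Hispanic": (0.62, 0.40, 0.95),
    "Asian": (0.40, 0.23, 0.66),
}


def ci_excludes_one(lower, upper):
    """True when a confidence interval lies entirely below or above 1."""
    return upper < 1.0 or lower > 1.0


for group, (odds_ratio, lower, upper) in reported.items():
    verdict = "excludes 1" if ci_excludes_one(lower, upper) else "includes 1"
    print(f"{group}: OR {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f} ({verdict})")
```

All three reported intervals lie below 1, which matches the abstract's statement that non-White participants had lower odds of completing the initial virtual study visit.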
Conclusion
Despite successful recruitment of racially and ethnically diverse participants, there were differences in subsequent engagement by non-White compared with that by White participants. Equitable representation and engagement of diverse racial and ethnic groups in digital clinical studies requires further study.
Trial Registration
Clinicaltrials.gov Identifier: NCT03335800
Title: Racial and Ethnic Representation and Study Engagement in a Siteless Digital Clinical Trial Using a Smartwatch: Findings From the Apple Heart Study
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100232

Pub Date: 2025-06-02 DOI: 10.1016/j.mcpdig.2025.100209
Title: Erratum to “Assessment of Positive Cardiac Remodeling in Hypertrophic Obstructive Cardiomyopathy Using an Artificial Intelligence-Based Electrocardiographic Platform in Patients Treated With Mavacamten”
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100209

Pub Date: 2025-06-01 DOI: 10.1016/j.mcpdig.2025.100226
Kishwen Kanna Yoga Ratnam MD, MPH, DrPH
Title: What Becomes of the Human Touch in the Age of Generative Artificial Intelligence?
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 2, Article 100226

Pub Date: 2025-05-27 DOI: 10.1016/j.mcpdig.2025.100231
Vidhi Singh BS, Susan Cheng MD, MPH, Alan C. Kwan MD, MS, Joseph Ebinger MD, MS
Title: United States Food and Drug Administration Regulation of Clinical Software in the Era of Artificial Intelligence and Machine Learning
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 3, Article 100231

Pub Date: 2025-05-23 DOI: 10.1016/j.mcpdig.2025.100228
Alexander Saelmans MD , Tom Seinen PhD , Victor Pera PharmD , Aniek F. Markus PhD , Egill Fridgeirsson PhD , Luis H. John MSc , Lieke Schiphof-Godart PhD , Peter Rijnbeek PhD , Jenna Reps PhD , Ross Williams PhD
Objective
To summarize the implementation approaches and updating methods of clinically implemented models and subsequently advise researchers on implementation and updating.
Patients and Methods
We included studies describing the implementation of prognostic binary prediction models in a clinical setting. We retrieved articles from Embase, Medline, and Web of Science from January 1, 2010, to January 1, 2024. We performed data extraction, based on Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis and Prediction Model Risk of Bias Assessment guidelines, and summarized the findings.
Results
The search yielded 1872 articles. Following screening, 37 articles, describing 56 prediction models, were eligible for inclusion. The overall risk of bias was high in 86% of publications. In model development and internal validation, 32% of the models were assessed for calibration. External validation was performed for 27% of the models. Most models were implemented into the hospital information system (63%), followed by a web application (32%) and a patient decision aid tool (5%). Moreover, 13% of models have been updated following implementation.
Conclusion
Impact assessments generally showed successful model implementation and the ability to improve patient care, despite not fully adhering to prediction modeling best practice. Both impact assessment and updating could play a key role in identifying and lowering bias in models.
Implementation and Updating of Clinical Prediction Models: A Systematic Review. Alexander Saelmans MD, Tom Seinen PhD, Victor Pera PharmD, Aniek F. Markus PhD, Egill Fridgeirsson PhD, Luis H. John MSc, Lieke Schiphof-Godart PhD, Peter Rijnbeek PhD, Jenna Reps PhD, Ross Williams PhD. Mayo Clinic Proceedings. Digital health, 3(3), Article 100228. Published 2025-05-23. DOI: 10.1016/j.mcpdig.2025.100228
Pub Date: 2025-05-23 DOI: 10.1016/j.mcpdig.2025.100230
Fabio Borgonovo MD , Takahiro Matsuo MD , Francesco Petri MD , Seyed Mohammad Amin Alavi MD , Laura Chelsea Mazudie Ndjonko , Andrea Gori MD , Elie F. Berbari MD, MBA
Objective
To evaluate the ability of 15 different large language models (LLMs) to solve clinical cases with osteoarticular infections following published guidelines.
Materials and Methods
The study evaluated 15 LLMs across 5 categories of osteoarticular infections: periprosthetic joint infection, diabetic foot infection, native vertebral osteomyelitis, fracture-related infections, and septic arthritis. Models were selected systematically, including general-purpose and medical-specific systems, ensuring robust English support. In total, 126 text-based questions, developed by the authors from published guidelines and validated by experts, assessed diagnostic, management, and treatment strategies. Each model answered individually, with responses classified as correct or incorrect based on guidelines. All tests were conducted between April 17, 2025, and April 28, 2025. Results, presented as percentages of correct answers and aggregated scores, highlight performance trends. Mixed-effects logistic regression with a random question effect was used to quantify how each LLM compared in answering the study questions.
Results
The performance of 15 LLMs was evaluated, with the percentage of correct answers reported. OpenEvidence and Microsoft Copilot achieved the highest score (119/126 [94.4%]), excelling in multiple categories. ChatGPT-4o and Gemini 2.5 Pro each scored 117 of 126 (92.8%). When used as the reference, OpenEvidence was not inferior to any comparator and was superior to 5 LLMs. Performance varied across categories, highlighting the strengths and limitations of individual models.
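The reported percentages can be checked from the correct-answer counts given above. The following is a minimal sketch of that arithmetic only; the function names `percent_correct` and `truncate` are illustrative assumptions and are not taken from the study. Note that 117/126 is 92.857...%, which is 92.9% when rounded to one decimal and 92.8% when truncated; the abstract reports 92.8%.

```python
from math import floor

TOTAL_QUESTIONS = 126  # number of questions stated in the abstract

# Correct-answer counts reported in the abstract
correct = {
    "OpenEvidence": 119,
    "Microsoft Copilot": 119,
    "ChatGPT-4o": 117,
    "Gemini 2.5 Pro": 117,
}


def percent_correct(n_correct, total=TOTAL_QUESTIONS):
    """Return the share of correct answers as an unrounded percentage."""
    return 100.0 * n_correct / total


def truncate(value, decimals=1):
    """Drop digits beyond `decimals` without rounding up (hypothetical helper)."""
    factor = 10 ** decimals
    return floor(value * factor) / factor


for model, n in correct.items():
    pct = percent_correct(n)
    print(f"{model}: {n}/{TOTAL_QUESTIONS} -> "
          f"{round(pct, 1)}% rounded, {truncate(pct)}% truncated")
```

This reproduces only the descriptive percentages; it does not attempt the mixed-effects logistic regression described in the methods.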
Conclusion
OpenEvidence and Microsoft Copilot achieved the highest accuracy among the evaluated LLMs, highlighting their potential for precisely addressing complex clinical cases. This study emphasizes the need for specialized, validated artificial intelligence tools in medical practice. Although promising, current models face limitations in real-world applications and require further refinement to support clinical decision making reliably.
Battle of the Bots: Solving Clinical Cases in Osteoarticular Infections With Large Language Models. Mayo Clinic Proceedings. Digital health, 3(3), Article 100230. Published 2025-05-23. DOI: 10.1016/j.mcpdig.2025.100230