JMIR AI: Latest Articles

Correction: Real-World Evidence Synthesis of Digital Scribes Using Ambient Listening and Generative Artificial Intelligence for Clinician Documentation Workflows: Rapid Review.
IF 2 Pub Date : 2026-03-13 DOI: 10.2196/93250
Naga Sasidhar Kanaparthy, Yenny Villuendas-Rey, Tolulope Bakare, Zihan Diao, Mark Iscoe, Andrew Loza, Donald Wright, Conrad Safranek, Isaac V Faustino, Alexandria Brackett, Edward R Melnick, R Andrew Taylor
Citations: 0
Toward Retrieval-Grounded Evaluation for Conversational Large Language Model-Based Risk Assessment.
IF 2 Pub Date : 2026-03-12 DOI: 10.2196/90759
Yihan Hu
Citations: 0
Large Language Model-Based Agents for Physical Activity and Cognitive Training: Scoping Review.
IF 2 Pub Date : 2026-03-12 DOI: 10.2196/80123
Alessandro Silacci, Benedetta Giachetti, Leonardo Angelini, Nicola Francesco Lopomo, Giuseppe Andreoni, Elena Mugellini, Mauro Cherubini, Maurizio Caon

Background: Large language model (LLM)-based conversational agents have been increasingly used in digital health interventions. However, their specific application to physical activity (PA) and cognitive training-two critical well-being domains-has not been systematically mapped. In fact, these domains share an important need for personalized, adaptive support and conversational engagement, making them relevant targets for examining how LLM-based agents are currently conceptualized and deployed.

Objective: This scoping review aimed to map the extent, characteristics, and design practices of LLM-based conversational agents supporting PA or cognitive training, specifically analyzing their application contexts, social roles, and technological features.

Methods: Following PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, we searched Web of Science, Scopus, PubMed, ACM Digital Library, and IEEE Xplore for studies published between January 2018 and December 2024. We included eligible studies that described LLM-based conversational agents designed for PA or cognitive training. Two reviewers independently screened records and extracted data. Descriptive synthesis and framework analysis were used to characterize intervention domains, agent roles, prompting strategies, model types, and reported outcomes.

Results: Of 357 records screened, 10 studies met eligibility criteria (7 on PA and 3 on cognitive training). Applications predominantly involved coaching roles for PA and companion or scaffolding roles in cognitive domains. The agent landscape was dominated by proprietary LLMs (GPT-3.5, GPT-4, and Bard), with limited use of open-weight models. Prompt engineering emerged as a central yet inconsistently documented design mechanism. Reported outcomes mainly focused on perceived usefulness, engagement, or content quality, with few quantitative behavioral outcomes.

Conclusions: LLM-based conversational agents have demonstrated early promise for supporting PA and emerging approaches to cognitive training, yet the current evidence remains exploratory and methodologically limited. Key challenges persist, including inconsistent reporting of prompts, reliance on proprietary models with limited reproducibility, and a lack of standardized outcome measures. More rigorous and transparently documented evaluations of these tools are required to strengthen the evidence base and guide future development.

Citations: 0
Authors' Reply: Toward Retrieval-Grounded Evaluation for Conversational Large Language Model-Based Risk Assessment.
IF 2 Pub Date : 2026-03-12 DOI: 10.2196/91981
Mohammad Amin Roshani, Xiangyu Zhou, Yao Qiang, Srinivasan Suresh, Steve Hicks, Usha Sethuraman, Dongxiao Zhu
Citations: 0
AI-Based Personalized Therapy With Clinical Intelligence and Radiomics (SPOILS) for Patients With Low Back Pain: Prospective Observational Study.
IF 2 Pub Date : 2026-03-11 DOI: 10.2196/83322
Purushottam Kumar, Suyash Singh, Bunil Kumar Balabantaray

Background: Low back pain (LBP) is a leading cause of disability worldwide, affecting people of all ages while showing increasing prevalence among younger demographics. Patients may present with different symptoms and treatment responses despite identical magnetic resonance imaging results, making it difficult to determine whether surgical and medical interventions are appropriate.

Objective: This study aimed to develop SPOILS (Software to Predict Outcome in Lumbar Spondylosis), an artificial intelligence-based decision support tool that merges clinical intelligence and radiomics to generate customized therapy plans for patients with LBP.

Methods: The SPOILS system used deep learning models to perform automated segmentation, enabling the extraction of geometrical parameters, including disk height, disk width, vertebrae height, vertebrae width, canal diameter, disk height index, signal intensity, and disk volume. A labeled dataset was created using expert-verified Pfirrmann and spondylosis severity gradings to address the clinical issues stemming from manual grading variability and subjectivity. Machine learning algorithms were used with this combined dataset to predict outcomes and recommend personalized treatment plans.

Results: The DeepLabV3+ segmentation model with a ResNet50 encoder achieved 95.5% accuracy, which increased to 98.7% after 8-fold cross-validation and simultaneously improved precision (96.95%), recall (97.1%), Dice coefficient (96.9%), and intersection over union (IoU; 94.8%). The convolutional neural network with MobileNetV2 achieved 97.84% accuracy and 96.76% IoU for spondylosis severity prediction after cross-validation. The Gradient Boost classifier demonstrated the best results with geometrical data by achieving 91.65% accuracy and 84.59% IoU.
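For readers less familiar with the segmentation metrics quoted above, the following is a minimal sketch of how the Dice coefficient and IoU are commonly computed for a single pair of binary masks. It is not the authors' evaluation code; the function name `dice_and_iou` and the toy masks are hypothetical and chosen only for illustration.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equal-length binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy 2x4 masks flattened row by row (hypothetical values, not study data).
pred = [0, 1, 1, 1, 0, 1, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 1]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU). Averages taken over many images or folds need not satisfy that identity exactly, so the reported Dice and IoU values should not be expected to convert into one another precisely.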

Conclusions: SPOILS introduced an innovative method to customize LBP treatment through the combination of artificial intelligence technology with radiological data and clinical expertise.

Citations: 0
AI-Enabled Personalization of Semaglutide Therapy in Type 2 Diabetes: Systematic Review With an Integration Framework.
IF 2 Pub Date : 2026-03-09 DOI: 10.2196/86960
Ghinwa Barakat, Samer El Hajj Hassan, Hanane Akhdar, Nghia Duong-Trung, Wiam Ramadan

Background: Type 2 diabetes mellitus (T2D) is a rapidly growing global health concern requiring innovative treatment methods. Ozempic (semaglutide), a glucagon-like peptide-1 receptor agonist, has shown consistent effectiveness in lowering blood glucose levels, supporting weight loss, and minimizing cardiovascular complications. In parallel, artificial intelligence (AI) complements these efforts by converting raw data from wearable devices, electronic health records, and medical imaging into practical insights for efficient, tailored treatment plans.

Objective: The objective of this systematic review is to examine current evidence of AI-driven methods to optimize Ozempic-based T2D therapy.

Methods: A total of 18 peer-reviewed articles were identified, revealing four dominant thematic clusters: (1) patient stratification and risk prediction, (2) AI-enhanced imaging for body composition changes, (3) cardiovascular and metabolic risk assessment, and (4) personalized AI-driven dosage.

Results: Across multiple metrics, such as glycated hemoglobin reduction, weight loss, cardiovascular benefits, and adverse event mitigation, AI-based approaches outperformed standard fixed-dose regimens. A theoretical framework is proposed for AI-Ozempic integration, with continuous data collection, AI processing, clinical decision support, real-time support, and real-time feedback and modeling iteration refinement cycles.

Conclusions: Significant gaps persist, including the need for large-scale randomized controlled trials, longer follow-up periods, explainable AI models, regulatory validation, and practical strategies for routine clinical implementation. The findings emphasize AI's potential to transform semaglutide therapy while delineating important paths for future research.

Citations: 0
Performance of 5 AI Models on United States Medical Licensing Examination Step 1 Questions: Comparative Observational Study.
IF 2 Pub Date : 2026-03-09 DOI: 10.2196/76928
Dania El Natour, Mohamad Abou Alfa, Ahmad Chaaban, Reda Assi, Toufic Dally, Bahaa Bou Dargham

Background: Artificial intelligence (AI) models are increasingly being used in medical education. Although models like ChatGPT have previously demonstrated strong performance on United States Medical Licensing Examination (USMLE)-style questions, newer AI tools with enhanced capabilities are now available, necessitating comparative evaluations of their accuracy and reliability across different medical domains and question formats.

Objective: This study aimed to evaluate and compare the performance of 5 publicly available AI models: Grok, ChatGPT-4, Copilot, Gemini, and DeepSeek, on the USMLE Step 1 free 120-question set, assessing their accuracy and consistency across question types and medical subjects.

Methods: This cross-sectional observational study was conducted between February 10 and March 5, 2025. Each of the 119 USMLE-style questions (excluding 1 audio-based item) was presented to each AI model by using a standardized prompt cycle. Models answered each question 3 times to assess confidence and consistency. Questions were categorized as text-based or image-based and as case-based or information-based. Statistical analysis was performed using chi-square and Fisher exact tests, with Bonferroni adjustment for pairwise comparisons.
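As a rough illustration of the Bonferroni adjustment mentioned above, the sketch below counts the pairwise comparisons among the 5 models and divides a significance level by that count. The starting level of 0.05 is an assumption; the abstract does not state the alpha used, which family of comparisons was adjusted, or whether reported P values are shown before or after adjustment.

```python
from itertools import combinations

# The 5 models named in the abstract.
models = ["Grok", "ChatGPT-4", "Copilot", "Gemini", "DeepSeek"]

# Every unordered pair of models is one pairwise comparison.
pairs = list(combinations(models, 2))

alpha = 0.05  # assumed family-wise significance level (not stated in the abstract)
adjusted_alpha = alpha / len(pairs)  # Bonferroni: divide alpha by the number of tests

print(f"{len(pairs)} pairwise comparisons, per-test threshold = {adjusted_alpha:.4f}")
```

An equivalent convention multiplies each raw P value by the number of comparisons (capping at 1) and compares the result with the unadjusted alpha.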

Results: Grok achieved the highest score (109/119, 91.6%), followed by Copilot (101/119, 84.9%), Gemini (100/119, 84%), ChatGPT-4 (95/119, 79.8%), and DeepSeek (86/119, 72.3%). DeepSeek's lower score was due to an inability to process visual media, resulting in 0% accuracy on image-based items. When limited to text-only questions (n=96), DeepSeek's accuracy increased to 89.6% (86/96), matching Copilot. Grok showed the highest accuracy on image-based (21/23, 91.3%) and case-based questions (70/78, 89.7%), with statistically significant differences observed between Grok and DeepSeek on case-based items (P=.01). The models performed best in biostatistics and epidemiology (5.8/6, 96.7%) and worst in musculoskeletal, skin, and connective tissue (4.4/7, 62.9%). Grok maintained 100% consistency in responses, while Copilot demonstrated the most self-correction (112/119, 94.1% consistency), improving its accuracy to 89.9% (107/119) on the third attempt.
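The percentages above follow directly from the reported counts. The short check below recomputes them, using only figures quoted in the abstract; the dictionary layout and the helper name `pct` are illustrative choices, not part of the study.

```python
def pct(correct, total):
    """Percentage correct, rounded to one decimal place as in the abstract."""
    return round(100 * correct / total, 1)


# (correct, total, percentage as reported in the abstract)
reported = {
    "Grok": (109, 119, 91.6),
    "Copilot": (101, 119, 84.9),
    "Gemini": (100, 119, 84.0),
    "ChatGPT-4": (95, 119, 79.8),
    "DeepSeek": (86, 119, 72.3),
    "DeepSeek, text-only": (86, 96, 89.6),
    "Grok, image-based": (21, 23, 91.3),
    "Grok, case-based": (70, 78, 89.7),
    "Copilot, third attempt": (107, 119, 89.9),
}

for label, (correct, total, stated) in reported.items():
    computed = pct(correct, total)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {correct}/{total} = {computed}% ({status} reported {stated}%)")

# The text-only and image-based subsets together cover all 119 questions.
text_only, image_based = 96, 23
print(text_only + image_based)
```

DeepSeek's 86 correct answers are the same in both denominators, which is consistent with the statement that it scored 0% on the 23 image-based items.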

Conclusions: AI models showed varying strengths across domains, with Grok demonstrating the highest accuracy and consistency in this dataset, particularly for image-based and reasoning-heavy questions. Although ChatGPT-4 remains widely used, newer models like Grok and Copilot also performed competitively. Continuous evaluation is essential as AI tools rapidly evolve.

Citations: 0
ChatGPT as an AI-Enabled Educational Resource in Nursing Practice: Scoping Review of Uses, Outcomes, and Implementation Challenges.
IF 2 Pub Date : 2026-03-09 DOI: 10.2196/79551
Selviana Anwar, Saldy Yusuf, Maria Kurnyata Rante Kada, Farawansah Mustafa

Background: High-quality nursing services are essential for improving patient satisfaction and health outcomes. Today, artificial intelligence (AI) applications such as ChatGPT offer potential solutions to enhance patient education and assist nurses in providing more accurate and personalized information. Despite its promising potential in nursing education, concerns regarding information accuracy, privacy, and ethical considerations must be addressed.

Objective: This scoping review aimed to map the current evidence on the use of ChatGPT (OpenAI) as an educational resource in nursing practice, focusing on its educational functions, reported outcomes, and implementation challenges.

Methods: The literature search was conducted using 3 databases (PubMed, Scopus, and ProQuest), following the Population, Concept, and Context framework. Inclusion criteria encompassed studies published between 2019 and 2025, in English, and available in full text. The PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guideline was used to guide the screening and selection process.

Results: We included 20 articles and synthesized four main findings: (1) AI in patient education and simplification of medical information, (2) AI in clinical decision-making and patient monitoring, (3) AI in nursing education, and (4) challenges and prospects of AI in nursing. Across studies, commonly reported limitations involved response accuracy inconsistencies, ethical concerns, and the absence of standardized implementation guidelines.

Conclusions: ChatGPT shows promise as an adjunct educational resource in nursing practice, particularly for information accessibility and learner engagement. Nevertheless, its use requires professional oversight, ethical safeguards, and further implementation-focused research.

JMIR AI, volume 5, e79551. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12977163/pdf/
Citations: 0
Important Ethical, Technical, and Epidemiological Considerations in an AI Tool Production (ETEPAI): Scoping Review.
IF 2 Pub Date : 2026-03-05 DOI: 10.2196/80340
Boon How Chew, Kee Yuan Ngiam

Background: Artificial intelligence (AI) tools are being developed amid rapidly evolving technology. The convergence of ethical, technical, and research-methods considerations is crucial for multidisciplinary teams aiming to produce effective AI tools. The success of these tools postdeployment hinges on the intricate interplay between the AI system's development of its output through rigorous decision-making processes and stakeholders' capacity to act on the AI's recommendations.

Objective: This paper synthesizes ethical, technical, and epidemiological considerations for all involved in artificial intelligence tool production (ETEPAI), based on established guidelines, checklists, and frameworks.

Methods: Relevant guidelines, checklists, frameworks, and expert recommendations were systematically identified and synthesized into ETEPAI, an ethical, technical, and epidemiological framework for AI tool development in health care.

Results: From 30 reviewed frameworks, ETEPAI integrates critical considerations across 4 stages (design, development, deployment, and postdeployment) and 3 domains (ethical, technical, and epidemiological), providing a compact yet comprehensive guide. It includes probing questions, key indicators, and common pitfalls to support high-quality, ethically sound, and clinically relevant AI tools. ETEPAI aligns with European Union trustworthiness standards and is supported by a research proposal template and supplementary references to aid implementation and adoption. We present probing questions and critical pointers for each of the 4 stages, highlighting their relevance in health care settings. The design stage aligns with epidemiologic research methodologies, while the development stage emphasizes transparent project execution. The deployment and postdeployment stages focus on real-world implementation. Common pitfalls and challenges are also included to underline that neglecting ETEPAI considerations can have serious consequences.

Conclusions: Applying ETEPAI ensures comprehensive, complete, compact, and crisp consideration from conception to execution, promoting high-quality, ethically sound, and clinically relevant AI tools. The brevity and conciseness of ETEPAI might be adequate for trained personnel and serve as clear signposts to unprepared stakeholders.

JMIR AI, volume 5, e80340. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12977167/pdf/
Citations: 0
Enhancing COVID-19 Screening Models With Epidemiological and Mobility Features: Machine-Learning Model Study.
IF 2 Pub Date : 2026-03-05 DOI: 10.2196/54956
Hyunwoo Choo, Dohyung Lee, Soo-Yong Shin, Jiwoo Lee, Duhun Lee, Eonji Kim, Namsoo Oh, Christina Kim, Myeongchan Kim, Hyo Jung Kim

Background: Despite the significant post-COVID-19 pandemic surge in research using symptom data and machine learning (ML) for patient screening, data on patient trajectories and epidemiological conditions, although crucial, have remained underused.

Objective: This study aimed to enhance the performance of ML models for COVID-19 screening by incorporating mobility and epidemic information in addition to patient symptom data.

Methods: Data, including daily self-reported symptoms, location information, and test results, were collected from 48,798 individuals using a smartphone app. These data were then combined with Our World in Data and national government epidemic information to train 5 ML-based screening models to classify patient infection status. The models were logistic regression, extreme gradient boosting, light gradient boosting machine, tabular data network, and Google AutoML.

Results: The addition of mobility and epidemic data significantly improved the performance of all 5 models. The highest area under the receiver operating characteristic curve score increased from 0.8712 without mobility and epidemic data to 0.9104 with mobility and epidemic data. This highlights the considerable impact of external information on enhancing the performance of ML models.

Conclusions: This study demonstrated the potential of combining mobility and epidemic data, such as location information and national epidemic information, with patient symptom data to improve the accuracy of ML models for diagnosing COVID-19. Considering additional contextual information can enhance the ability to screen for COVID-19.
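
The Results above compare models by area under the receiver operating characteristic curve (AUROC), which rose from 0.8712 to 0.9104, an absolute gain of 0.0392. As a minimal sketch of what that metric measures, the snippet below computes AUROC from first principles as the probability that a randomly chosen positive case is scored above a randomly chosen negative case. The function name `auroc` and the toy scores are hypothetical illustrations and are not taken from the study, whose models and data are not reproduced here.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the pairwise (Mann-Whitney) identity.

    Returns the fraction of positive/negative pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = infected, 0 = not infected.
labels = [1, 1, 1, 0, 0, 0]
symptoms_only = [0.9, 0.4, 0.35, 0.5, 0.3, 0.2]  # assumed baseline model scores
with_context = [0.9, 0.6, 0.55, 0.5, 0.3, 0.2]   # assumed scores with extra features

auc_base = auroc(labels, symptoms_only)  # 7 of 9 pairs ranked correctly
auc_ctx = auroc(labels, with_context)    # all 9 pairs ranked correctly
print(f"baseline AUROC={auc_base:.4f}, with context AUROC={auc_ctx:.4f}")
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; for a dataset of the size described in the Methods, a rank-based or library implementation would be the practical choice.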

JMIR AI, volume 5, e54956. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12978548/pdf/
Citations: 0