
JMIR Medical Informatics: Latest Articles

Large Language Model-Enabled Editing of Patient Audio Interviews From "This Is My Story" Conversations: Comparative Study.
IF 3.8; Zone 3 (Medicine); Q2 Medical Informatics; Pub Date: 2026-01-09; DOI: 10.2196/80205
Bikram Bains, Sampath Rapuri, Edgar Robitaille, Jonathan Wang, Arnav Khera, Catalina Gomez, Eduardo Reyes, Cole Perry, Jason Wilson, Elizabeth Tracey

Background: This Is My Story (TIMS) was started by Chaplain Elizabeth Tracey to promote a humanistic approach to medicine. Patients in the TIMS program are the subject of a guided conversation in which a chaplain interviews either the patient or their loved one. The interviewee is asked four questions to elicit clinically actionable information that has been shown to improve communication between patients and medical providers, strengthening medical providers' empathy. The original recorded conversation is edited into a condensed audio file approximately 1 minute and 15 seconds in length and placed in the electronic health record, where it is easily accessible by all providers caring for the patient.

Objective: TIMS is active at the Johns Hopkins Hospital and has shown value in assisting with provider empathy and communication. It is unique in using audio recordings to accomplish this purpose. As the program expands, the time and resources needed to manually edit audio conversations are a barrier to adoption. To address this, we propose an automated solution using a large language model to create meaningful and concise audio summaries.

Methods: We analyzed 24 TIMS audio interviews and created three edited versions of each: (1) expert-edited, (2) artificial intelligence (AI)-edited using a fully automated large language model pipeline, and (3) novice-edited by two medical students trained by the expert. A second expert, blinded to the editor, rated the audio interviews in a randomized order. This expert scored both the audio quality and content quality of each interview on 5-point Likert scales. We quantified transcript similarity to the expert-edited reference using lexical and semantic similarity metrics and identified omitted content relative to that same expert interview.

Results: Audio quality (flow, pacing, clarity) and content quality (coherence, relevance, nuance) were each rated on 5-point Likert scales. Expert-edited interviews received the highest mean ratings for both audio quality (4.84) and content quality (4.83). Novice-edited interviews scored moderately (3.84 audio, 3.63 content), and AI-edited interviews scored slightly lower (3.49 audio, 3.20 content). Novice and AI edits were rated significantly lower than the expert edits (P<.001), but not significantly different from each other. AI and novice-edited interview transcripts had comparable overlap with the expert reference transcript, while qualitative review found frequent omissions of patient identity, actionable insights, and overall context in both the AI and novice-edited interviews. AI editing was fully automated and significantly reduced the editing time compared to both human editors.

Conclusions: An AI-based editing pipeline can generate TIMS audio summaries with content and audio quality comparable to those of novice human editors with one hour of training. AI significantly reduces editing time and removes the need for human training; with further validation, it could offer a solution for scaling TIMS to a wider range of health care settings.
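
The abstract does not name the lexical similarity metric that was used. Purely as an illustration of what a lexical overlap measure and an omission check could look like, the sketch below computes Jaccard token overlap between an edited transcript and a reference transcript. The function names, the tokenizer, and the sample sentences are all assumptions made for this example and are not taken from the study.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_overlap(candidate: str, reference: str) -> float:
    """Share of distinct tokens that the two transcripts have in common."""
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand and not ref:
        return 1.0  # two empty transcripts are trivially identical
    return len(cand & ref) / len(cand | ref)


def omitted_tokens(candidate: str, reference: str) -> set[str]:
    """Tokens present in the reference edit but missing from the candidate."""
    return tokenize(reference) - tokenize(candidate)


# Hypothetical transcripts, invented for illustration only.
reference = "She loves gardening and wants to be called Betty"
candidate = "She loves gardening"

print(jaccard_overlap(candidate, reference))
print(sorted(omitted_tokens(candidate, reference)))
```

A set-based measure like this ignores word order and paraphrase, which is presumably why the study pairs lexical metrics with semantic ones and a qualitative review of omissions.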
JMIR Medical Informatics, vol 14, e80205. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12788710/pdf/
Citations: 0
Applicability of Existing Gender Scores for German Clinical Research Data: Scoping Review and Data Mapping.
IF 3.8; Zone 3 (Medicine); Q2 Medical Informatics; Pub Date: 2026-01-08; DOI: 10.2196/74162
Lea Schindler, Hilke Beelich, Elpiniki Katsari, Daniele Liprandi, Sylvia Stracke, Dagmar Waltemath

Background: Considering sex and gender improves research quality, innovation, and social equity, while ignoring them leads to inaccuracies and inefficiency in study results. Despite increasing attention on sex- and gender-sensitive medicine, challenges remain with accurately representing gender due to its dynamic and context-specific nature.

Objective: This work aims to contribute to the implementation of a standard for collecting and assessing gender-specific data in German university hospitals and associated research facilities.

Methods: We carried out a review to identify and categorize state-of-the-art gender scores. We systematically assessed 22 publications regarding the applicability and practicability of their proposed gender scores. Specifically, we evaluated the use of these gender scores on German research data from routine clinical practice, using the Medical Informatics Initiative core dataset (MII CDS).

Results: Different methods for assessing gender have been proposed, but no standardized and validated gender score is available for health research. Most gender scores target epidemiological or public health research where questions about social aspects and life habits are already part of the questionnaires. However, it is challenging to apply concepts for gender scoring on clinical data. The MII CDS, for example, lacks all variables currently being recorded in gender scores. Although some of the required variables are indeed present in routine clinical data, they need to become part of the MII CDS.

Conclusions: To enable gender-specific retrospective analysis of routine clinical data, we recommend updating and expanding the MII CDS by including more gender-relevant information. For this purpose, we provide concrete action steps on how gender-related variables can be captured in routine clinical practice and represented in a machine-readable way.

JMIR Medical Informatics, vol 14, e74162. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12782135/pdf/
Citations: 0
Large Language Models in Patient Health Communication for Atherosclerotic Cardiovascular Disease: Pilot Cross-Sectional Comparative Analysis.
IF 3.8; Zone 3 (Medicine); Q2 Medical Informatics; Pub Date: 2026-01-07; DOI: 10.2196/81422
Pengfei Li, Yinfei Xu, Xiang Liu, Zhean Shen, Yi Wang, Xinyi Lv, Ziyi Lu, Hui Wu, Jiaqi Zhuang, Yan Chen

Background: Large language models (LLMs) have emerged as promising tools for enhancing public access to medical information, particularly for chronic diseases such as atherosclerotic cardiovascular disease (ASCVD). However, their effectiveness in patient-centered health communication remains underexplored, especially in multilingual contexts.

Objective: Our study aimed to conduct a comparative evaluation of 3 advanced LLMs (DeepSeek R1, ChatGPT-4o, and Gemini) in generating responses to ASCVD-related patient queries in both English and Chinese, assessing their performance across the domains of accuracy, completeness, and comprehensibility.

Methods: We conducted a cross-sectional evaluation based on 25 clinically validated ASCVD questions spanning 5 domains: definitions, diagnosis, treatment, prevention, and lifestyle. Each question was submitted 5 times to each of the 3 LLMs in both English and Chinese, yielding 750 responses in total, all generated under default settings to approximate real-world conditions. Three board-certified cardiologists blinded to model identity independently scored the responses using standardized Likert scales with predefined anchors. The assessment followed a rigorous multistage process that incorporated randomization, washout periods, and final consensus scoring.

Results: DeepSeek R1 achieved the highest "good response" rates (24/25, 96% in both English and Chinese), substantially outperforming ChatGPT-4o (21/25, 84%) and Gemini (12/25, 48% in English and 17/25, 68% in Chinese). DeepSeek R1 demonstrated superior median accuracy scores (6, IQR 6-6 in both languages) and completeness scores (3, IQR 2-3 in both languages) compared to the other models (P<.001). All models had a median comprehensibility score of 3; however, in English, DeepSeek R1 and ChatGPT-4o were rated significantly clearer than Gemini (P=.006 and P=.03, respectively), whereas no significant between-model differences were observed in Chinese (P=.08). Interrater reliability was moderate (Kendall W: accuracy=0.578; completeness=0.565; comprehensibility=0.486). Performance was consistently stronger for definitional and diagnostic questions than for treatment and prevention topics across all models. Specifically, none of the models consistently provided responses aligned with the latest clinical guidelines for the following key guideline-facing question: "What is the standard treatment regimen for ASCVD?"

Conclusions: DeepSeek R1 exhibited promising and consistent performance in generating high-quality, patient-facing ASCVD information across both English and Chinese, highlighting the potential of open-source LLMs in promoting digital health literacy and equitable access to chronic disease information. However, a clinically critical weakness was observed in guideline-sensitive treatment: the models did not reliably provide guideline-concordant standard treatment regimens, suggesting that LLM use should be limited to lower-risk informational queries (eg, definitions, diagnosis, and lifestyle education) unless expert oversight and safety controls are in place.
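
The response count and the interrater statistic reported above can be illustrated with a short sketch. It checks the design arithmetic (25 questions, 5 repeats, 3 models, 2 languages) and computes Kendall's W with the usual tie correction, since Likert scores tie often. Whether the authors applied a tie correction is not stated in the abstract, so that choice is an assumption, and the rating matrix is invented for illustration.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n, giving tied scores the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W with tie correction; ratings[r][i] is rater r's score for item i.

    Undefined (division by zero) if every rater gives all items the same score.
    """
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    ties = sum(t ** 3 - t for row in ratings for t in Counter(row).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


# Study design arithmetic: questions x repeats x models x languages.
total_responses = 25 * 5 * 3 * 2
print(total_responses)

# Hypothetical scores from 3 raters on 4 responses (not study data).
example_ratings = [[6, 5, 3, 4], [6, 4, 3, 5], [5, 6, 3, 4]]
print(kendalls_w(example_ratings))
```

W ranges from 0 (no agreement in rankings) to 1 (identical rankings), so the reported values of roughly 0.49 to 0.58 sit in the middle of that range.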
JMIR Medical Informatics, vol 14, e81422.
Citations: 0
Deep Learning for Dynamic Prognostic Prediction in Minimally Invasive Surgery for Intracerebral Hemorrhage: Model Development and Validation Study.
IF 3.8; Zone 3 (Medicine); Q2 Medical Informatics; Pub Date: 2026-01-07; DOI: 10.2196/86327
Jingxuan Wang, Jian Shi, Qing Ye, Danyang Chen, Yuhao Sun, Chao Pan, Yingxin Tang, Ping Zhang, Zhouping Tang

Background: The pathological and physiological state of patients with intracerebral hemorrhage (ICH) after minimally invasive surgery (MIS) evolves dynamically, and traditional models cannot dynamically predict prognosis. Clinical data collected at multiple time points often differ in category and number and contain missing values. Existing models lack methods to deal with imbalanced data.

Objective: This study aims to develop and validate a dynamic prognostic model using multi-time point data from patients with ICH undergoing MIS to predict survival and functional outcomes.

Methods: In this study, data from 287 patients who underwent MIS for ICH were retrospectively collected on the day of surgery; on days 1, 3, 7, and 14 after surgery; and on the day of drainage tube removal. Their general information, vital signs, laboratory test findings, neurological function scores, head hematoma volume, and MIS-related indicators were collected. In addition, this study proposes a multistep attention model, namely the MultiStep Transformer. The model simultaneously outputs 3 predicted probabilities: 30-day survival, 180-day survival, and 180-day favorable functional outcome (modified Rankin Scale [mRS] 0-3). Five-fold cross-validation was used to evaluate the performance of the model and compare it with mainstream models and traditional scores. The main evaluation indexes included accuracy, precision, recall, and F1-score. The predictive performance of the model was evaluated using receiver operating characteristic (ROC) curves; its calibration was assessed via calibration curves; and its clinical utility was examined using decision curve analysis (DCA). Attributable value analysis was conducted to assess the key predictive features.

Results: The 30-day survival rate, 180-day survival rate, and 180-day favorable functional outcome rate among 287 patients were 92.3%, 88.8%, and 52.3%, respectively. In terms of predictive efficacy for survival and functional outcomes, the MultiStep Transformer model showed a remarkable superiority over traditional scoring systems and other deep learning models. For these three outcomes, the model achieved areas under the receiver operating characteristic curves (AUROCs) of 0.87 (95% CI 0.82-0.92), 0.85 (95% CI 0.77-0.93), and 0.75 (95% CI 0.72-0.78), with corresponding Brier scores of 0.1041, 0.1115, and 0.231. DCA confirmed that the model provided a definite clinical net benefit when threshold probabilities ranged within 0.06-0.26, 0.04-0.5, and 0.21-0.71, respectively.

Conclusions: The MultiStep Transformer model proposed in this study can effectively use imbalanced data to construct a model. It possesses good dynamic prediction ability for short-term and long-term survival and functional outcome of patients with ICH undergoing MIS, providing a novel tool for individualized prognostic assessment of patients with ICH undergoing MIS.
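
Two of the evaluation quantities named above, the Brier score and the net benefit used in decision curve analysis, have simple closed forms. The sketch below shows the standard definitions on a tiny invented sample; it is not the study's code, and the function names and the data are assumptions for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every patient whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in DCA: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes and predicted probabilities (not study data).
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.6, 0.7, 0.2]

print(brier_score(outcomes, predicted))
print(net_benefit(outcomes, predicted, threshold=0.5))
print(net_benefit_treat_all(outcomes, threshold=0.5))
```

A model is usually described as clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is 0; the threshold ranges quoted in the Results are the intervals where that held.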
JMIR Medical Informatics, vol 14, e86327.
Citations: 0
Impact of Age on Hospital Outcomes Following Minimally Invasive Posterior Lumbar Interbody Fusion: Retrospective Analysis of the Nationwide Inpatient Sample Database from 2016 to 2020.
IF 3.8; Zone 3 (Medicine); Q2 Medical Informatics; Pub Date: 2026-01-06; DOI: 10.2196/76424
Yu-Jun Lin, Fu-Yuan Shih, Jin-Fu Huang, Chun-Wei Ting, Yu-Chin Tsai, Lin Chang, Yu-Hua Huang, Ming-Jung Chuang

Background: Minimally invasive posterior lumbar interbody fusion (MIS-PLIF) is commonly performed to treat degenerative lumbar spinal conditions. Patients of advanced age often present with multiple comorbidities and reduced physiological reserves, influencing surgical risks and recovery. The growing aging population has led to a rising demand for care for older adults, posing significant challenges for health care systems worldwide.

Objective: This study aimed to identify the associations between different age groups and MIS-PLIF outcomes.

Methods: This study retrospectively analyzed data from the United States Nationwide Inpatient Sample collected between 2016 and 2020. Patients aged ≥60 years who underwent MIS-PLIF were eligible for inclusion in this study. Patients were categorized into age groups (60-69, 70-79, and ≥80 y). Logistic and linear regressions were used to determine the associations between the study variables and outcomes, including in-hospital mortality, complications, nonroutine discharge, and length of stay.

Results: A total of 785 patients aged ≥60 (mean age 69.4, SD 0.2) years who underwent MIS-PLIF were included in the analysis, and 18.7% (147/785) experienced at least one complication. After adjustment, compared with patients aged 60 to 69 years, the risk of nonroutine discharge was significantly increased in patients aged 70 to 79 years (adjusted odds ratio 2.33, 95% CI 1.57-3.46; P<.001) and ≥80 years (adjusted odds ratio 4.79, 95% CI 2.64-8.67; P<.001). No significant differences in the risk of complications or length of hospital stay were observed across the age groups.

Conclusions: In older patients undergoing MIS-PLIF, advanced age is an independent predictor of nonroutine discharge. Furthermore, our findings suggest that age alone is not an independent risk factor for complications or extended hospital stays among older patients. These findings underscore that MIS-PLIF is a viable option for older patients, for whom extra attention may still be needed for postoperative care. Implementing age-stratified management for older patients undergoing MIS-PLIF may have important clinical policy implications.
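The adjusted odds ratios in the Results are reported with 95% CIs, which is enough to back out the approximate log-odds coefficient, its standard error, and a Wald z statistic. The sketch below is a reader-side consistency check, assuming the intervals are symmetric Wald intervals on the log scale; the function name `wald_from_ci` is hypothetical, and this is not the authors' analysis code.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds, SE, Wald z, and two-sided p from an OR and its 95% CI.

    Assumes the CI was built as exp(beta +/- z_crit * SE), which is an
    assumption about the reporting, not something the abstract states.
    """
    beta = math.log(odds_ratio)                       # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                     # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))              # two-sided normal p-value
    return beta, se, z, p


# Values taken from the abstract: ages 70-79 and >=80 versus 60-69.
for label, (or_, lo, hi) in {
    "70-79": (2.33, 1.57, 3.46),
    ">=80": (4.79, 2.64, 8.67),
}.items():
    beta, se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{label}: beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p:.2g}")
```

Under that assumption, both comparisons give a z statistic well above the two-sided threshold for P<.001, which agrees with the reported P values.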

Citations: 0
Automated Classification of Lymphoma Subtypes From Histopathological Images Using a U-Net Deep Learning Model: Comparative Evaluation Study.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-06 DOI: 10.2196/72679
Jin Zhao, Xiaolian Wen, Li Ma, Liping Su
<p><strong>Background: </strong>Accurate classification and grading of lymphoma subtypes are essential for treatment planning. Traditional diagnostic methods face challenges of subjectivity and inefficiency, highlighting the need for automated solutions based on deep learning techniques.</p><p><strong>Objective: </strong>This study aimed to investigate the application of deep learning technology, specifically the U-Net model, in classifying and grading lymphoma subtypes to enhance diagnostic precision and efficiency.</p><p><strong>Methods: </strong>In this study, the U-Net model was used as the primary tool for image segmentation integrated with attention mechanisms and residual networks for feature extraction and classification. A total of 620 high-quality histopathological images representing 3 major lymphoma subtypes were collected from The Cancer Genome Atlas and the Cancer Imaging Archive. All images underwent standardized preprocessing, including Gaussian filtering for noise reduction, histogram equalization, and normalization. Data augmentation techniques such as rotation, flipping, and scaling were applied to improve the model's generalization capability. The dataset was divided into training (70%), validation (15%), and test (15%) subsets. Five-fold cross-validation was used to assess model robustness. Performance was benchmarked against mainstream convolutional neural network architectures, including fully convolutional network, SegNet, and DeepLabv3+.</p><p><strong>Results: </strong>The U-Net model achieved high segmentation accuracy, effectively delineating lesion regions and improving the quality of input for classification and grading. The incorporation of attention mechanisms further improved the model's ability to extract key features, whereas the residual structure of the residual network enhanced classification accuracy for complex images. 
In the test set (N=1250), the proposed fusion model achieved an accuracy of 92% (1150/1250), a sensitivity of 91.04% (1138/1250), a specificity of 89.04% (1113/1250), and an F1-score of 90% (1125/1250) for the classification of the 3 lymphoma subtypes, with an area under the receiver operating characteristic curve of 0.95 (95% CI 0.93-0.97). The high sensitivity and specificity of the model indicate strong clinical applicability, particularly as an assistive diagnostic tool.</p><p><strong>Conclusions: </strong>Deep learning techniques based on the U-Net architecture offer considerable advantages in the automated classification and grading of lymphoma subtypes. The proposed model significantly improved diagnostic accuracy and accelerated pathological evaluation, providing efficient and precise support for clinical decision-making. Future work may focus on enhancing model robustness through integration with advanced algorithms and validating performance across multicenter clinical datasets. The model also holds promise for deployment in digital pathology platforms and artificial intelligence-assisted diagnostic workflows, improving screening efficiency and promoting consistency in pathological classification.</p>
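For readers unfamiliar with how the metrics in the Results relate to one another, the following minimal sketch computes accuracy, sensitivity, specificity, and F1-score from a one-vs-rest confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall: true positives found
    specificity = tn / (tn + fp)          # true negatives correctly rejected
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for one subtype versus the rest (not from the paper).
metrics = binary_metrics(tp=90, fp=15, tn=85, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Each metric has its own denominator: sensitivity is computed over actual positives, specificity over actual negatives, and accuracy over all cases. For a 3-class problem, per-class values like these are typically averaged across classes.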
Citations: 0
Scaling Wireless Continuous Vital Signs Monitoring Across an Eight-Hospital Health System: A Digital Health Implementation Report.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-03 DOI: 10.2196/78216
Ngoc-Anh Anh Nguyen, Grace Lee, Brendan M Holderread, Terrie Holman, Sarah N Pletcher, Roberta Schwartz
<p><strong>Background: </strong>Frequent vital signs (VS) monitoring is central to inpatient safety but is traditionally measured manually every four hours, a century-old practice that can miss early deterioration, disrupt patient sleep, and impose a heavy documentation burden on nursing staff. Continuous vital signs monitoring (CVSM) using wearable remote patient monitoring (RPM) devices enables near real-time, high-frequency VS measurement while reducing manual workload and preserving patient rest.</p><p><strong>Objective: </strong>This implementation report describes the large-scale implementation of CVSM across an eight-hospital health system. The initiative aimed to: (1) enhance earlier detection of patient deterioration through continuous, algorithm-driven monitoring; (2) improve nursing workflow efficiency by reducing reliance on manual VS checks; and (3) minimize nighttime disruptions to support patient rest and recovery.</p><p><strong>Methods: </strong>The program was designed for system-wide scalability and executed from 2022 to 2024 using a four-phase framework: strategic program design, program planning, go-live preparation, and implementation and optimization. The FDA-cleared wearable device (BioButton®; BioIntelliSense, Golden, CO, USA) continuously measured heart rate (HR), respiratory rate (RR), and skin temperature, with integration into Epic and 24/7 oversight through a centralized Virtual Operations Center (VOC). Rollout followed a staggered playbook across ~2,700 adult non-ICU beds, supported by leadership engagement, supply chain readiness, training, and phased superuser-led adoption.</p><p><strong>Results: </strong>All eight hospitals achieved full deployment between April 2023 and February 2024, with >95% device utilization rates and 100% nursing staff training completion. A standardized escalation workflow filtered ~50% of alerts at the VOC review step, substantially reducing frontline alert burden. 
Operational refinements included revised HR and RR thresholds and removal of temperature as a single alert trigger. Several units extended overnight manual VS intervals from every four hours to every six to eight hours, with staff estimating ~4 hours saved per nursing shift. Patient care assistants redirected time toward mobility and personal needs, while staff reported growing confidence in device performance.</p><p><strong>Conclusions: </strong>This initiative represents the first system-wide deployment of CVSM across a diverse, multi-hospital health system. Success was enabled by early strategic alignment, phased rollout, robust IT and monitoring infrastructure, and iterative optimization. The program demonstrates the feasibility of embedding CVSM into routine inpatient care to improve efficiency and patient experience. Transferable strategies, including phased rollouts, centralized monitoring, and structured change management, may inform other health systems pursuing digital vital signs redesign. Future work should rigorously evaluate the impact on patient outcomes, cost-effectiveness, and applicability to post-acute and outpatient care.</p>
Citations: 0
Improving Clinical Decision-Making in Treating Airway Diseases with an Expert System Built Upon the Free AI Tool Google NotebookLM®.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-03 DOI: 10.2196/78567
Cheng-Hao Hsu, Ching-Li Hsu, Chih-Hsiang Tsou, Kuo-Fang Hsu, Hung-Yu Yang

Objective: We employed the free artificial intelligence (AI) tool Google NotebookLM®, powered by the large language model (LLM) Gemini 2.0, to construct a medical decision-making aid for diagnosing and managing airway diseases, and subsequently evaluated its functionality and performance in clinical workflow.

Methods: After feeding this tool with relevant published clinical guidelines for these diseases, we evaluated the feasibility of the system regarding its behavior, ability, and potential, and created simulated cases in which the system was used to solve associated medical problems. The test and simulation questions were designed by a pulmonologist, and the appropriateness (focusing on accuracy and completeness) of the AI responses was judged independently by three pulmonologists. The system was then deployed in an emergency department (ED) setting, where it was tested by medical staff (n=20) to see how it affected the process of clinical consultation. Test opinions were collected through a questionnaire.

Results: Most (58/84=66.7%) of the specialists' ratings regarding AI responses were above average. The inter-rater reliability was moderate on accuracy (intraclass correlation coefficient [ICC]=0.612, P<.001) and good on completeness (ICC=0.773, P<.001). When deployed in an ED setting, this system could respond with reasonable answers and enhance the literacy of personnel about these diseases. The potential to save the time spent in consultation did not reach statistical significance (Kolmogorov-Smirnov D=.223, P=.237) across all participants, but indicated a favorable outcome when only physicians' responses were analyzed.

Conclusions: This system is customizable, cost-efficient, and accessible to clinicians and allied professionals treating airway diseases, without requiring any computer coding experience. It provides convincing guideline-based recommendations, increases the staff's medical literacy, and potentially saves physicians' time spent on consultation. It warrants further evaluation in other medical disciplines and healthcare environments.

Citations: 0
Large Language Model-Based Virtual Patient Systems for History-Taking in Medical Education: Comprehensive Systematic Review.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-02 DOI: 10.2196/79039
Dongliang Li, Syaheerah Lebai Lutfi

Background: Large language models (LLMs), such as GPT-3.5 and GPT-4 (OpenAI), have been transforming virtual patient systems in medical education by providing scalable and cost-effective alternatives to standardized patients. However, systematic evaluations of their performance, particularly for multimorbidity scenarios involving multiple coexisting diseases, are still limited.

Objective: This systematic review aimed to evaluate LLM-based virtual patient systems for medical history-taking, addressing four research questions: (1) simulated patient types and disease scope, (2) performance-enhancing techniques, (3) experimental designs and evaluation metrics, and (4) dataset characteristics and availability.

Methods: Following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020, 9 databases were searched (January 1, 2020, to August 18, 2025). Nontransformer LLMs and non-history-taking tasks were excluded. Multidimensional quality and bias assessments were conducted.

Results: A total of 39 studies were included, screened by one computer science researcher under supervision. LLM-based virtual patient systems mainly simulated internal medicine and mental health disorders, with many addressing distinct single disease types but few covering multimorbidity or rare conditions. Techniques like role-based prompts, few-shot learning, multiagent frameworks, knowledge graph (KG) integration (top-k accuracy 16.02%), and fine-tuning enhanced dialogue and diagnostic accuracy. Multimodal inputs (eg, speech and imaging) improved immersion and realism. Evaluations, typically involving 10-50 students and 3-10 experts, demonstrated strong performance (top-k accuracy: 0.45-0.98, hallucination rate: 0.31%-5%, System Usability Scale [SUS] ≥80). However, small samples, inconsistent metrics, and limited controls restricted generalizability. Common datasets such as MIMIC-III (Medical Information Mart for Intensive Care-III) exhibited intensive care unit (ICU) bias and lacked diversity, affecting reproducibility and external validity.

Conclusions: Included studies showed moderate risk of bias, inconsistent metrics, small cohorts, and limited dataset transparency. LLM-based virtual patient systems excel in simulating multiple disease types but lack multimorbidity patient representation. KGs improve top-k accuracy and support structured disease representation and reasoning. Future research should prioritize hybrid KG-chain-of-thought architectures integrated with open-source KGs (eg, UMLS [Unified Medical Language System] and SNOMED-CT [Systematized Nomenclature of Medicine - Clinical Terms]), parameter-efficient fine-tuning, dialogue compression, multimodal LLMs, standardized metrics, larger cohorts, and open-access multimodal datasets to further enhance realism, diagnostic accuracy, fairness, and educational utility.
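Top-k accuracy, the metric cited in the Results, counts a case as correct when the true diagnosis appears anywhere among the model's k highest-ranked candidates. The sketch below illustrates the calculation on a few hypothetical simulated cases; the diagnoses, rankings, and the function name `top_k_accuracy` are illustrative assumptions and do not come from any of the reviewed studies.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the first k ranked guesses."""
    hits = sum(
        1
        for preds, truth in zip(ranked_predictions, true_labels)
        if truth in preds[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked differential diagnoses for four simulated patients.
ranked = [
    ["asthma", "copd", "pneumonia"],
    ["migraine", "tension headache", "sinusitis"],
    ["gerd", "angina", "gastritis"],
    ["influenza", "covid-19", "common cold"],
]
truth = ["copd", "migraine", "gastritis", "strep throat"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(ranked, truth, k):.2f}")
```

Because the score can only rise as k grows, a reported top-k range is hard to compare across studies unless each study states the k it used.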

Citations: 0
AI Enabled CRM Platforms for Patient Services in Healthcare Early Lessons from Governance and Program Level Outcomes.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-01 DOI: 10.2196/83564
Anup Kant Gupta

Background: AI-enabled CRM platforms are increasingly used in healthcare to improve patient services, but real-world evidence about how these systems influence affordability, adherence, and access remains limited. Many enterprises adopt CRM workflows without clear governance, operational definitions, or measurement standards, which leads to inconsistent outcomes and low adoption.

Objective: To summarize early operational lessons from four large enterprise implementations of AI-enabled CRM platforms and describe program-level changes in affordability support, therapy initiation time, and therapy discontinuation rates.

Methods: A case-informed thematic analysis was conducted across four enterprise CRM implementations between 2019 and 2024. Programs included large national healthcare organizations serving more than 500,000 patients annually. Aggregated, de-identified operational dashboards and governance documents were reviewed. Adoption was defined as the proportion of active CRM users among provisioned patient service users. Baseline values were taken from pre-implementation operations and compared with stabilized post-implementation periods. No patient-level or identifiable data were used, and institutional review board approval was not required.

Results: Programs that aligned CRM workflows with patient-centered outcomes showed higher adoption. Active user rates exceeded 85 percent, compared with less than 60 percent in programs without structured governance. CRM-supported affordability checks showed increased completion rates within service teams. Therapy initiation time improved in programs that used AI-assisted triage. Program-level therapy discontinuation rates decreased when proactive risk flags were incorporated into CRM workflows. These changes reflect descriptive pre-post operational signals, not causal estimates.

Conclusions: AI-enabled CRM platforms can support improvements in patient service operations when backed by clear governance and well-defined metrics. Observed improvements in affordability support, initiation time, and discontinuation rates were program-level trends that require further study with more rigorous designs. The findings provide early lessons for organizations implementing AI-driven CRM systems in healthcare.

Citations: 0