
Journal of the American Medical Informatics Association: latest articles

SDoH-GPT: using large language models to extract social determinants of health.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf094
Bernardo Consoli, Haoyang Wang, Xizhi Wu, Song Wang, Xinyu Zhao, Yanshan Wang, Justin Rousseau, Tom Hartvigsen, Li Shen, Huanmei Wu, Yifan Peng, Qi Long, Tianlong Chen, Ying Ding

Objective: Extracting social determinants of health (SDoHs) from medical notes depends heavily on labor-intensive annotations, which are typically task-specific, hampering reusability and limiting sharing. Here, we introduce SDoH-GPT, a novel framework leveraging few-shot learning large language models (LLMs) to automate the extraction of SDoH from unstructured text, aiming to improve both efficiency and generalizability.

Materials and methods: SDoH-GPT combines two components: few-shot LLM methods that extract SDoH from medical notes, and XGBoost classifiers trained on the annotations those LLM methods generate. The combination exploits the strength of LLMs as few-shot learners and the efficiency of XGBoost once the training dataset is large enough. As a result, SDoH-GPT can extract SDoH without relying on extensive medical annotations or costly human intervention.

Results: Our approach achieved tenfold and twentyfold reductions in time and cost, respectively, and superior consistency with human annotators measured by Cohen's kappa of up to 0.92. The innovative combination of LLM and XGBoost can ensure high accuracy and computational efficiency while consistently maintaining 0.90+ AUROC scores.
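Cohen's kappa, the agreement measure quoted above, corrects raw agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal Python sketch of that formula. The function name `cohens_kappa` and the toy labels are illustrative assumptions, not code or data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is
        # formally undefined, and returning 1.0 is a convention chosen here.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (1 = SDoH present, 0 = absent), not study data.
human = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
model = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
kappa = cohens_kappa(human, model)
print(round(kappa, 3))
```

In this toy case the raters agree on 9 of 10 items (p_o = 0.9) while chance agreement is 0.5, so kappa is lower than the raw agreement rate.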

Discussion: This study has verified SDoH-GPT on three datasets and highlights the potential of leveraging LLM and XGBoost to revolutionize medical note classification, demonstrating its capability to achieve highly accurate classifications with significantly reduced time and cost.

Conclusion: The key contribution of this study is the integration of LLMs with XGBoost, which enables cost-effective, high-quality annotation of SDoH. This research sets the stage for SDoH to be more accessible, scalable, and impactful in driving future healthcare solutions.

Journal of the American Medical Informatics Association, pp. 67-78. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758468/pdf/
Citations: 0
Dependence of premature ventricular complexes on heart rate-it's not that simple.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf069
Adrien Osakwe, Noah Wightman, Marc W Deyell, Zachary Laksman, Alvin Shrier, Gil Bub, Leon Glass, Thomas M Bury

Objective: Frequent premature ventricular complexes (PVCs) can lead to adverse health conditions such as cardiomyopathy. The linear correlation between PVC frequency and heart rate (as positive, negative, or neutral) on a 24-hour Holter recording has been proposed as a way to classify patients and guide treatment with beta-blockers. Our objective was to evaluate the robustness of this classification to measurement methodology, different 24-hour periods, and nonlinear dependencies of PVCs on heart rate.

Materials and methods: We analyzed 82 multi-day Holter recordings (1-7 days each) collected from 48 patients with frequent PVCs (burden 1%-44%). For each recording, the linear correlation between PVC frequency and heart rate was computed for different 24-hour periods, using intervals of different lengths to determine PVC frequency.

Results: Using a 1-hour interval, the correlation between PVC frequency and heart rate was consistently positive, negative, or neutral on different days in only 36.6% of patients. Using shorter time intervals, the correlation was consistent in 56.1% of patients. Shorter time intervals revealed nonlinear and piecewise linear relationships between PVC frequency and heart rate in many patients.
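The classification being tested can be sketched in a few lines of Python: compute the Pearson correlation between per-interval heart rate and PVC frequency for each 24-hour period, label the day positive, negative, or neutral, and check whether the labels agree across days. The cutoff of 0.3 for "neutral", the function names, and the toy values are assumptions made for illustration; the abstract does not state the thresholds the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant series has no defined correlation; treated as 0 here.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def classify_day(heart_rates, pvc_freqs, threshold=0.3):
    """Label one 24-hour period; `threshold` is an assumed cutoff."""
    r = pearson_r(heart_rates, pvc_freqs)
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "neutral"


def is_consistent(day_labels):
    """True when every analyzed day received the same label."""
    return len(set(day_labels)) == 1


# Hypothetical per-interval means for two days of one recording.
day1 = ([60, 70, 80, 90], [2, 4, 6, 8])   # PVCs rise with heart rate
day2 = ([60, 70, 80, 90], [8, 6, 4, 2])   # PVCs fall with heart rate
labels = [classify_day(hr, pvc) for hr, pvc in (day1, day2)]
consistent = is_consistent(labels)
print(labels, consistent)
```

A patient like this toy example would be classified differently depending on which day was recorded, which is the instability the study reports.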

Discussion: The variability of the correlation between PVC frequency and heart rate across different 24-hour periods and interval durations suggests that the relationship is neither strictly linear nor stationary. A better understanding of the mechanism driving the PVCs, combined with computational and biological models that represent these mechanisms, may provide insight into the observed nonlinear behavior and guide more robust classification strategies.

Conclusion: Linear correlation as a tool to classify patients with frequent PVCs should be used with caution. It is sensitive to the specific 24-hour period analyzed and the methodology used to segment the data. More sophisticated classification approaches that can capture nonlinear and time-varying dependencies should be developed and considered in clinical practice.

Journal of the American Medical Informatics Association, pp. 90-97. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758478/pdf/
Citations: 0
Enhancing end-stage renal disease outcome prediction: a multisourced data-driven approach.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf118
Yubo Li, Rema Padman

Objectives: To improve prediction of chronic kidney disease (CKD) progression to end-stage renal disease (ESRD) using machine learning (ML) and deep learning (DL) models applied to integrated clinical and claims data with varying observation windows, supported by explainable artificial intelligence (AI) to enhance interpretability and reduce bias.

Materials and methods: We utilized data from 10 326 CKD patients, combining clinical and claims information from 2009 to 2018. After preprocessing, cohort identification, and feature engineering, we evaluated multiple statistical, ML and DL models using 5 distinct observation windows. Feature importance and SHapley Additive exPlanations (SHAP) analysis were employed to understand key predictors. Models were tested for robustness, clinical relevance, misclassification patterns, and bias.

Results: Integrated data models outperformed single data source models, with the long short-term memory (LSTM) model achieving the highest area under the receiver operating characteristic curve (AUROC) (0.93) and F1 score (0.65). A 24-month observation window best balanced early detection and prediction accuracy. The 2021 estimated glomerular filtration rate (eGFR) equation improved prediction accuracy and reduced racial bias, particularly for African American patients.
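The two headline metrics measure different things: AUROC scores how well the model ranks positives above negatives, while F1 depends on a specific decision threshold. A minimal Python sketch of both follows. The function names, the 0.5 threshold, and the toy scores are assumptions for illustration and are unrelated to the study's data.

```python
def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Pairwise O(P*N) form, fine for small inputs.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy outcomes (1 = progressed to ESRD) and model scores.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.7, 0.6, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc = auroc(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(round(auc, 3), round(f1, 3))
```

Because F1 ignores true negatives, a single false positive among few true positives lowers it noticeably even when ranking is good, which is one common reason a high AUROC can sit alongside a modest F1 on imbalanced outcomes.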

Discussion: Improved prediction accuracy, interpretability, and bias mitigation strategies have the potential to enhance CKD management, support targeted interventions, and reduce health-care disparities.

Conclusion: This study presents a robust framework for predicting ESRD outcomes, improving clinical decision-making through integrated multisourced data and advanced analytics. Future research will expand data integration and extend this framework to other chronic diseases.

Journal of the American Medical Informatics Association, pp. 26-36. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758457/pdf/
Citations: 0
Using transfer learning to improve prediction of suicide risk in acute care hospitals.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf126
Shane J Sacco, Kun Chen, Fei Wang, Steven C Rogers, Robert H Aseltine

Objective: Emerging efforts to identify patients at risk of suicide have focused on developing predictive algorithms for use in healthcare settings. We address a major challenge to effective risk modeling: healthcare settings with insufficient data to create and apply risk models. This study aimed to improve risk prediction through transfer learning, or data fusion, by incorporating risk information from external data sources to augment the data available in a particular clinical setting.

Materials and methods: In this retrospective study, we developed predictive models in individual Connecticut hospitals using medical claims data. We compared conventional models containing demographics and historical medical diagnosis codes with fusion models containing conventional features and fused risk information that described similarities in historical diagnosis codes between patients from the hospital and patients receiving care for suicide attempts at other hospitals.

Results: Our sample contained 27 hospitals and 636 758 patients aged 18 to 64 years. Fusion improved prediction for 93% of hospitals and slightly worsened it for 7%. Median areas under the ROC and precision-recall curves of conventional models were 77.6% and 3.4%, respectively. Fusion improved these metrics by a median of 3.3 and 0.3 points, respectively (Ps < .001). Median sensitivities and positive predictive values at 90% and 95% specificity were also improved (Ps < .001).
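Reporting sensitivity and positive predictive value "at 90% specificity" means choosing the score threshold that leaves at least 90% of non-cases unflagged, then measuring performance at that threshold. The Python sketch below shows one way to do this. The function name, the tie-handling rule (flag only scores strictly above the threshold), and the toy scores are assumptions, not the authors' procedure.

```python
import math


def metrics_at_specificity(y_true, scores, target_specificity=0.90):
    """Sensitivity, PPV, and achieved specificity at a fixed specificity."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = sorted(s for y, s in zip(y_true, scores) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Number of negatives that must stay unflagged to reach the target.
    # The small epsilon guards against floating-point overshoot in ceil.
    k = math.ceil(target_specificity * len(neg) - 1e-9)
    threshold = neg[k - 1] if k > 0 else float("-inf")
    # Flag only scores strictly above the threshold, so tied negatives
    # stay unflagged and achieved specificity is never below the target.
    tp = sum(1 for s in pos if s > threshold)
    fp = sum(1 for s in neg if s > threshold)
    sensitivity = tp / len(pos)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = 1 - fp / len(neg)
    return sensitivity, ppv, specificity


# Hypothetical toy data: 10 non-cases followed by 2 cases.
y_true = [0] * 10 + [1, 1]
scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.50,
          0.30, 0.04]
sens, ppv, spec = metrics_at_specificity(y_true, scores, 0.90)
print(sens, ppv, round(spec, 2))
```

With rare outcomes, even a small false-positive rate yields many false alarms relative to true cases, so positive predictive value and precision-recall area stay low even when the ROC area looks strong.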

Discussion: This study provided strong evidence that data fusion improved model performance across hospitals. Improvement was of greatest magnitude in facilities treating relatively few suicidal patients.

Conclusion: Data fusion holds promise as a methodology to improve suicide risk prediction in healthcare settings with limited or incomplete data.

Journal of the American Medical Informatics Association, pp. 159-166. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758463/pdf/
Citations: 0
Smart Imitator: Learning from Imperfect Clinical Decisions.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocae320
Dilruk Perera, Siqi Liu, Kay Choong See, Mengling Feng

Objectives: This study introduces Smart Imitator (SI), a 2-phase reinforcement learning (RL) solution enhancing personalized treatment policies in healthcare, addressing challenges from imperfect clinician data and complex environments.

Materials and methods: Smart Imitator's first phase uses adversarial cooperative imitation learning with a novel sample selection schema to categorize clinician policies from optimal to nonoptimal. The second phase creates a parameterized reward function to guide the learning of superior treatment policies through RL. Smart Imitator's effectiveness was validated on 2 datasets: a sepsis dataset with 19 711 patient trajectories and a diabetes dataset with 7234 trajectories.

Results: Extensive quantitative and qualitative experiments showed that SI significantly outperformed state-of-the-art baselines in both datasets. For sepsis, SI reduced estimated mortality rates by 19.6% compared to the best baseline. For diabetes, SI reduced HbA1c-High rates by 12.2%. The learned policies aligned closely with successful clinical decisions and deviated strategically when necessary. These deviations aligned with recent clinical findings, suggesting improved outcomes.

Discussion: Smart Imitator advances RL applications by addressing challenges such as imperfect data and environmental complexities, demonstrating effectiveness within the tested conditions of sepsis and diabetes. Further validation across diverse conditions and exploration of additional RL algorithms are needed to enhance precision and generalizability.

Conclusion: This study shows the potential of learning from clinician behaviors to advance personalized healthcare and improve treatment outcomes. Its methodology offers a robust approach to adaptive, personalized strategies in complex and uncertain environments.

Journal of the American Medical Informatics Association, pp. 49-66. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758472/pdf/
Citations: 0
Predicting mortality in hospitalized influenza patients: integration of deep learning-based chest X-ray severity score (FluDeep-XR) and clinical variables.
IF 4.6; Zone 2 (Medicine); Q1 Computer Science, Information Systems; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocae286
Meng-Han Tsai, Sung-Chu Ko, Amy Huaishiuan Huang, Lorenzo Porta, Cecilia Ferretti, Clarissa Longhi, Wan-Ting Hsu, Yung-Han Chang, Jo-Ching Hsiung, Chin-Hua Su, Filippo Galbiati, Chien-Chang Lee

Objectives: To pioneer the first artificial intelligence system integrating radiological and objective clinical data, simulating the clinical reasoning process, for the early prediction of high-risk influenza patients.

Materials and methods: Our system was developed using a cohort from National Taiwan University Hospital in Taiwan, with external validation data from ASST Grande Ospedale Metropolitano Niguarda in Italy. Convolutional neural networks pretrained on ImageNet were regressively trained using a 5-point scale to develop the influenza chest X-ray (CXR) severity scoring model, FluDeep-XR. Early, late, and joint fusion structures, incorporating varying weights of CXR severity with clinical data, were designed to predict 30-day mortality and compared with models using only CXR or clinical data. The best-performing model was designated as FluDeep. The explainability of FluDeep-XR and FluDeep was illustrated through activation maps and SHapley Additive exPlanations (SHAP).
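Late fusion combines the outputs of separately trained models rather than their input features. As a rough illustration of "varying weights of CXR severity with clinical data", the Python sketch below blends two risk estimates with a tunable weight. This plain weighted average is a stand-in chosen for brevity; it is not the Random Forest-based late fusion the authors report, and the function name, weights, and probabilities are all hypothetical.

```python
def late_fusion(p_cxr, p_clinical, cxr_weight=0.5):
    """Blend two mortality-risk estimates with a weighted average.

    p_cxr:       risk estimate derived from the imaging model (assumed input)
    p_clinical:  risk estimate derived from clinical variables (assumed input)
    cxr_weight:  share of the final estimate given to the imaging model
    """
    for name, value in (("p_cxr", p_cxr), ("p_clinical", p_clinical),
                        ("cxr_weight", cxr_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return cxr_weight * p_cxr + (1.0 - cxr_weight) * p_clinical


# Hypothetical patient: imaging suggests high risk, clinical data moderate.
fused = [late_fusion(0.8, 0.4, w) for w in (0.25, 0.5, 0.75)]
print([round(p, 2) for p in fused])
```

Sweeping `cxr_weight` over a validation set is one simple way to compare how much each modality should contribute; early fusion would instead concatenate the severity score with the clinical features before training a single model.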

Results: The Xception-based model, FluDeep-XR, achieved a mean squared error of 0.738 in the external validation dataset. The Random Forest-based late fusion model, FluDeep, outperformed all other models, achieving an area under the receiver operating characteristic curve of 0.818 and a sensitivity of 0.706 in the external dataset. Activation maps highlighted clear lung fields. SHAP identified age, C-reactive protein, hematocrit, heart rate, and respiratory rate as the top 5 clinical features.

Discussion: Integrating medical imaging with objective clinical data outperformed single-modality models in predicting 30-day mortality in influenza patients. We ensured that the explainability of our models aligned with clinical knowledge and validated their applicability across institutions in different countries.

Conclusion: FluDeep highlights the potential of combining radiological and clinical information in a late fusion design, enhancing diagnostic accuracy and offering an explainable and generalizable decision support system.

目的开创首个整合放射学和客观临床数据的人工智能系统,模拟临床推理过程,用于早期预测高危流感患者:我们的系统是利用台湾国立台湾大学医院的队列数据开发的,外部验证数据来自意大利的 ASST Grande Ospedale Metropolitano Niguarda。在 ImageNet 上预先训练的卷积神经网络使用 5 点量表进行回归训练,从而开发出流感胸部 X 光(CXR)严重程度评分模型 FluDeep-XR。设计了早期、晚期和联合融合结构,将不同权重的 CXR 严重程度与临床数据相结合,用于预测 30 天死亡率,并与仅使用 CXR 或临床数据的模型进行比较。表现最好的模型被命名为 FluDeep。结果表明,FluDeep-XR 和 FluDeep 可通过激活图和 SHapley Additive exPlanations(SHAP)进行解释:结果:基于 Xception 的模型 FluDeep-XR 在外部验证数据集中的均方误差为 0.738。基于随机森林的后期融合模型 FluDeep 的表现优于所有其他模型,在外部数据集中的接收器工作曲线下面积为 0.818,灵敏度为 0.706。激活图突出显示了清晰的肺野。夏普利加性解释将年龄、C 反应蛋白、血细胞比容、心率和呼吸频率确定为最重要的 5 个临床特征:在预测流感患者 30 天死亡率方面,医学影像与客观临床数据的整合优于单一模式。我们确保了模型的可解释性与临床知识的一致性,并验证了其在国外机构的适用性:FluDeep凸显了在后期融合设计中结合放射学和临床信息的潜力,提高了诊断准确性,并提供了一个可解释、可推广的决策支持系统。
Pages: 133-143 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758471/pdf/
Citations: 0
Integrating state-space modeling, parameter estimation, deep learning, and docking techniques in drug repurposing: a case study on COVID-19 cytokine storm.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf035
Abhisek Bakshi, Kaustav Gangopadhyay, Sujit Basak, Rajat K De, Souvik Sengupta, Abhijit Dasgupta

Objective: This study addresses the significant challenges posed by emerging SARS-CoV-2 variants, particularly in developing diagnostics and therapeutics. Drug repurposing is investigated by identifying critical regulatory proteins impacted by the virus, providing rapid and effective therapeutic solutions for better disease management.

Materials and methods: We employed a comprehensive approach combining mathematical modeling and efficient parameter estimation to study the transient responses of regulatory proteins in both normal and virus-infected cells. Proportional-integral-derivative (PID) controllers were used to pinpoint specific protein targets for therapeutic intervention. Additionally, advanced deep learning models and molecular docking techniques were applied to analyse drug-target and drug-drug interactions, ensuring both efficacy and safety of the proposed treatments. This approach was applied to a case study focused on the cytokine storm in COVID-19, centering on Angiotensin-converting enzyme 2 (ACE2), which plays a key role in SARS-CoV-2 infection.
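As a generic illustration of the PID control law mentioned above (not the authors' state-space model, whose equations are not given in this abstract), the following minimal sketch drives a hypothetical first-order system toward a setpoint. The plant dynamics, the gains, and the step size are all assumptions chosen for illustration.

```python
def simulate_pid(kp=2.0, ki=1.0, kd=0.1, setpoint=1.0, dt=0.01, steps=5000):
    """Discrete PID loop on a toy plant dx/dt = -x + u (hypothetical, for illustration)."""
    x = 0.0                      # plant state, e.g. a normalized protein level
    integral = 0.0               # running integral of the error
    prev_error = setpoint - x    # avoids a derivative spike on the first step
    for _ in range(steps):
        error = setpoint - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative   # PID control law
        x += dt * (-x + u)       # forward-Euler update of the toy plant
        prev_error = error
    return x


final_state = simulate_pid()
print(f"state after 50 time units: {final_state:.4f}")
```

The integral term is what removes the steady-state offset in this toy system: with `ki=0.0` the state would settle below the setpoint.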

Results: Our findings suggest that activating ACE2 presents a promising therapeutic strategy, whereas inhibiting AT1R seems less effective. Deep learning models, combined with molecular docking, identified Lomefloxacin and Fostamatinib as stable drugs with no significant thermodynamic interactions, suggesting their safe concurrent use in managing COVID-19-induced cytokine storms.

Discussion: The results highlight the potential of ACE2 activation in mitigating lung injury and severe inflammation caused by SARS-CoV-2. This integrated approach accelerates the identification of safe and effective treatment options for emerging viral variants.

Conclusion: This framework provides an efficient method for identifying critical regulatory proteins and advancing drug repurposing, contributing to the rapid development of therapeutic strategies for COVID-19 and future global pandemics.

Pages: 193-209 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758467/pdf/
Citations: 0
AI-Techniques Loss-Based Algorithm for Severity Classification (ATLAS): a novel approach for continuous quantification of exertional symptoms during incremental exercise testing.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf051
Abed A Hijleh, Sophia Wang, Danilo C Berton, Igor Neder-Serafini, Sandra Vincent, Matthew James, Nicolle Domnik, Devin Phillips, Luiz E Nery, Denis E O'Donnell, J Alberto Neder

Objective: Heightened muscular effort and breathlessness (dyspnea) are disabling sensory experiences. We sought to move from the current approach of assessing these symptoms only at maximal effort to new paradigms based on their continuous quantification throughout cardiopulmonary exercise testing (CPET).

Materials and methods: After establishing sex- and age-adjusted reference centiles (0-10 Borg scale), we developed a novel algorithm (AI-Techniques Loss-Based Algorithm for Severity Classification [ATLAS]) based on reciprocal exponential loss for CPET data from patients with chronic obstructive lung disease of varied severity.

Results: Categories of dyspnea intensity by ATLAS, but not dyspnea at peak exercise, correctly discriminated patients with progressively greater resting and exercise impairment (P < .05).

Discussion: This new AI-techniques approach will be translated to the care of disabled patients to uncover the seeds and consequences of their activity-related symptoms.

Conclusions: We used innovative informatics research to change paradigms in displaying, quantifying, and analyzing effort-related symptoms in patient populations.

Pages: 220-226 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758462/pdf/
Citations: 0
Navigating the landscape of personalized oncology: overcoming challenges and expanding horizons with computational modeling.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf144
Melike Sirlanci, David Albers, Jennifer Kwak, Clayton Smith, Tellen D Bennett, Steven M Bair

Objectives: We discuss the challenges of using computational modeling for personalized prediction in clinical practice, specifically predicting treatment response for rare diseases treated with novel therapies, using clinical oncology as an example context. Several challenges are discussed, including data scarcity, data sparsity, and difficulties in establishing interdisciplinary teams. Machine learning (ML), mechanistic modeling (MM), and hybrid modeling (HM) are discussed in the context of these challenges.

Materials and methods: We present an HM approach, combining ML and MM techniques for improved personalized model estimation in the context of chimeric antigen receptor T-cell therapy for aggressive lymphoma.

Results: The HM approach improved the root mean squared error by 61.27 ± 23.21% compared to using MM alone (MM: $2.36 \times 10^{5} \pm 1.68 \times 10^{5}$ and HM: $9.57 \times 10^{4} \pm 8.37 \times 10^{4}$, where the units are in cells), computed from 13 patients included in this study.
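To relate the two reported error levels, here is a small sketch of the RMSE metric and of the relative improvement implied by the cohort means quoted above (2.36 × 10^5 cells for MM, 9.57 × 10^4 cells for HM). The function names are hypothetical. The ratio of the means works out to roughly 59%, not the reported 61.27%; one plausible reading, which is an assumption here, is that the reported figure averages per-patient improvements rather than comparing cohort means.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def percent_improvement(baseline_error, new_error):
    """Relative error reduction of the new model versus the baseline, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error


mm_rmse = 2.36e5   # mean RMSE for mechanistic modeling alone, as reported (cells)
hm_rmse = 9.57e4   # mean RMSE for hybrid modeling, as reported (cells)
improvement = percent_improvement(mm_rmse, hm_rmse)
print(f"improvement from cohort means: {improvement:.1f}%")
```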

Discussion: By exploiting the complementary strengths of ML and MM approaches, the developed HM method addresses common limitations such as data scarcity and sparsity in medical settings, especially common for rare diseases.

Conclusion: HM techniques are likely required to overcome data scarcity and sparsity issues in broad medical settings. Developing these techniques requires dedicated interdisciplinary teams.

Pages: 242-251 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758477/pdf/
Citations: 0
Machine learning-based risk prediction of outcomes in patients hospitalized with COVID-19 in Australia: the AUS-COVID Score.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf016
Hari P Sritharan, Harrison Nguyen, William van Gaal, Leonard Kritharides, Clara K Chow, Ravinay Bhindi

Objectives: We aimed to develop a highly interpretable and effective machine learning (ML)-based risk prediction algorithm to predict in-hospital mortality, intubation, and adverse cardiovascular events in patients hospitalized with coronavirus disease 2019 (COVID-19) in Australia (AUS-COVID Score).

Materials and methods: This prospective study across 21 hospitals included 1714 consecutive patients aged ≥ 18 in their index hospitalization with COVID-19. The dataset was separated into training (80%) and test sets (20%). Eight supervised ML methods were used: least absolute shrinkage and selection operator (LASSO), ridge, elastic net (EN), decision tree, support vector machine, random forest, AdaBoost, and gradient boosting. A feature selection method was used to establish informative variables, which were considered in groups of 5/10/15/20/all. The final model was selected by balancing the optimal area under the curve (AUC) score with interpretability, through the number of included variables. The coefficients of the final models were used to build the AUS-COVID Score.
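The AUC used here for model selection can be read as the probability that a randomly chosen patient who had the outcome receives a higher risk score than one who did not. The sketch below computes it directly from that definition. The function name and the toy scores are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """AUC as P(score_pos > score_neg), counting ties as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive case correctly ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy example (made-up scores): 1 = outcome occurred, 0 = it did not.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_labels = [1, 0, 1, 0, 0]
print(auc(toy_scores, toy_labels))
```

This pairwise form is quadratic in the number of patients, which is fine for a cohort of this size; rank-based implementations give the same value more efficiently.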

Results and discussion: Among the patients, 181 (10.6%) died in-hospital, 148 (8.6%) required intubation, and 90 (5.3%) had adverse cardiovascular events. The LASSO model performed best for predicting in-hospital mortality (AUC 0.85) using 5 variables: age, respiratory rate, COVID-19 features on chest X-ray, troponin elevation, and COVID-19 vaccination (≥1 dose). The EN model performed best for predicting intubation (AUC 0.75) and adverse cardiovascular events (AUC 0.64), each with 5 variables. A user-friendly web-based application was built for clinician use at the bedside.

Conclusion: The AUS-COVID Score is an accurate and practical ML-based risk score to predict in-hospital mortality, intubation, and adverse cardiovascular events in hospitalized COVID-19 patients.

Pages: 210-219 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758480/pdf/
Citations: 0