
PLOS Digital Health: latest articles

Use of a continuous single lead electrocardiogram analytic to predict patient deterioration requiring rapid response team activation.
Pub Date : 2024-10-24 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000465
Sooin Lee, Bryce Benson, Ashwin Belle, Richard P Medlin, David Jerkins, Foster Goss, Ashish K Khanna, Michael A DeVita, Kevin R Ward

Identifying the onset of patient deterioration is challenging despite the potential to respond to patients earlier with better vital sign monitoring and rapid response team (RRT) activation. In this study, an ECG-based software as a medical device, the Analytic for Hemodynamic Instability Predictive Index (AHI-PI), was compared to the vital signs of heart rate, blood pressure, and respiratory rate to evaluate how early it indicated risk before an RRT activation. A higher proportion of the events had risk indication by AHI-PI (92.71%) than by vital signs (41.67%). AHI-PI indicated risk early, on average more than a day before RRT events. In events whose risks were indicated by both AHI-PI and vital signs, AHI-PI recognized deterioration earlier than vital signs did. A case-control study showed that situations requiring RRTs were more likely to have AHI-PI risk indication than those that did not. The study derived several insights in support of AHI-PI's efficacy as a clinical decision support system. The findings demonstrated AHI-PI's potential to serve as a reliable predictor of future RRT events. It could potentially help clinicians recognize early clinical deterioration and respond to deterioration that vital signs miss, thereby helping to improve clinical outcomes.

PLOS Digital Health 3(10): e0000465. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500862/pdf/
Citations: 0
Conceptualizing bias in EHR data: A case study in performance disparities by demographic subgroups for a pediatric obesity incidence classifier.
Pub Date : 2024-10-23 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000642
Elizabeth A Campbell, Saurav Bose, Aaron J Masino

Electronic Health Records (EHRs) are increasingly used to develop machine learning models in predictive medicine. There has been limited research on using machine learning methods to predict childhood obesity and on related disparities in classifier performance among vulnerable patient subpopulations. In this work, classification models are developed to recognize pediatric obesity using temporal condition patterns obtained from patient EHR data in a U.S. study population. We trained four machine learning algorithms (Logistic Regression, Random Forest, Gradient Boosted Trees, and Neural Networks) to classify cases and controls as obesity positive or negative, and optimized hyperparameter settings through a bootstrapping methodology. To assess the classifiers for bias, we studied model performance by population subgroup, then used permutation analysis to identify the most predictive features for each model and the demographic characteristics of patients with these features. Mean AUC-ROC values were consistent across classifiers, ranging from 0.72 to 0.80. Some evidence of bias was identified, although it took the form of the models performing better for minority subgroups (African Americans and patients enrolled in Medicaid). Permutation analysis revealed that patients from vulnerable population subgroups were over-represented among patients with the most predictive diagnostic patterns. We hypothesize that our models performed better on under-represented groups because the features more strongly associated with obesity were more commonly observed among minority patients. These findings highlight the complex ways that bias may arise in machine learning models and can be incorporated into future research to develop a thorough analytical approach to identify and mitigate bias that may arise from features and within EHR datasets when developing more equitable models.
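
The subgroup comparison described in this abstract can be illustrated with a minimal sketch. It computes a rank-based AUC-ROC separately for each demographic subgroup. The function names, the subgroup labels "A" and "B", and the toy scores are all hypothetical; this is not the authors' code or data.

```python
from collections import defaultdict


def auc_roc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(records):
    """records: iterable of (subgroup, true_label, predicted_score)."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: auc_roc(ys, ss) for g, (ys, ss) in grouped.items()}


# Hypothetical toy data: (subgroup, obesity label, classifier score).
records = [
    ("A", 1, 0.9), ("A", 0, 0.2), ("A", 1, 0.7), ("A", 0, 0.4),
    ("B", 1, 0.6), ("B", 0, 0.7), ("B", 1, 0.8), ("B", 0, 0.3),
]
print(auc_by_subgroup(records))
```

A gap between subgroup AUCs, in either direction, is the kind of performance disparity the abstract treats as evidence of bias.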

PLOS Digital Health 3(10): e0000642. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11498669/pdf/
Citations: 0
Impact of observability period on the classification of COPD diagnosis timing among Medicare beneficiaries with lung cancer.
Pub Date : 2024-10-22 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000633
Eman Metwally, Sarah E Soppe, Jennifer L Lund, Sharon Peacock Hinton, Caroline A Thompson

Background: Investigators often use claims data to estimate the diagnosis timing of chronic conditions. However, misclassification of chronic conditions is common due to variability in healthcare utilization and in claims history across patients.

Objective: We aimed to quantify the effect of various Medicare fee-for-service continuous enrollment periods and lookback periods (LBPs) on misclassification of COPD and on sample size.

Methods: We present a stepwise tutorial to classify COPD based on its diagnosis timing relative to lung cancer diagnosis, using the Surveillance Epidemiology and End Results cancer registry linked to Medicare insurance claims. We used 3 approaches, varying the LBP and the required continuous enrollment (i.e., observability) period from 1 to 5 years. Patients with lung cancer were classified based on their COPD-related healthcare utilization into 3 groups: pre-existing COPD (diagnosis at least 3 months before lung cancer diagnosis), concurrent COPD (diagnosis within ±3 months of lung cancer diagnosis), and non-COPD. Among those with 5 years of continuous enrollment, we estimated the sensitivity of the LBP to ascertain COPD diagnosis as the number of patients with pre-existing COPD using a shorter LBP divided by the number of patients with pre-existing COPD using a longer LBP.

Results: Extending the LBP from 1 to 5 years increased the prevalence of pre-existing COPD from ~36% to 51% and decreased both concurrent COPD (from ~34% to 23%) and non-COPD (from ~29% to 25%). Extending the required continuous enrollment period beyond one year had minimal effect across the various LBPs. In those with 5 years of continuous enrollment, sensitivity of COPD classification (95% CI) increased with longer LBP, from 70.1% (69.7% to 70.4%) for a one-year LBP to 100% for a 5-year LBP.

Conclusion: The length of optimum LBP and continuous enrollment period depends on the context of the research question and the data generating mechanisms. Among Medicare beneficiaries, the best approach to identify diagnosis timing of COPD relative to lung cancer diagnosis is to use all available LBP with at least one year of required continuous enrollment.
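
The classification rule and the sensitivity ratio from the Methods can be sketched as below. The sketch assumes that "3 months" means 90 days and that a lookback year is 365 days; the abstract does not give exact day counts. The function names and the four toy patients are hypothetical, and real claims-based algorithms typically add further rules (for example, code lists and claim counts) that are not modeled here.

```python
from datetime import date, timedelta

# Assumptions for illustration only: "3 months" is taken as 90 days and a
# lookback year as 365 days.
WINDOW = timedelta(days=90)
DAYS_PER_YEAR = 365


def classify_copd(cancer_dx, copd_claim_dates, lookback_years):
    """Classify COPD timing relative to the lung cancer diagnosis date,
    using only claims visible within the lookback period."""
    start = cancer_dx - timedelta(days=DAYS_PER_YEAR * lookback_years)
    end = cancer_dx + WINDOW
    visible = [d for d in copd_claim_dates if start <= d <= end]
    if any(d <= cancer_dx - WINDOW for d in visible):
        return "pre-existing"  # seen at least 3 months before diagnosis
    if visible:
        return "concurrent"    # seen only within -/+ 3 months of diagnosis
    return "non-COPD"


def lbp_sensitivity(patients, short_years, long_years):
    """Pre-existing COPD count under the shorter LBP divided by the count
    under the longer LBP."""
    short = sum(classify_copd(dx, c, short_years) == "pre-existing"
                for dx, c in patients)
    long_ = sum(classify_copd(dx, c, long_years) == "pre-existing"
                for dx, c in patients)
    return short / long_


# Hypothetical patients: (lung cancer diagnosis date, COPD claim dates).
dx = date(2015, 6, 1)
patients = [
    (dx, [date(2014, 12, 1)]),                    # pre-existing under any LBP
    (dx, [date(2012, 3, 1), date(2015, 5, 15)]),  # concurrent at 1y, pre-existing at 5y
    (dx, []),                                     # non-COPD
    (dx, [date(2011, 1, 10)]),                    # non-COPD at 1y, pre-existing at 5y
]
for years in (1, 5):
    print(years, [classify_copd(d, c, years) for d, c in patients])
print(lbp_sensitivity(patients, 1, 5))
```

In the toy data, lengthening the LBP moves patients out of the concurrent and non-COPD groups and into the pre-existing group, the same direction of change the Results report.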

PLOS Digital Health 3(10): e0000633. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495636/pdf/
Citations: 0
Learning and diSentangling patient static information from time-series Electronic hEalth Records (STEER).
Pub Date : 2024-10-21 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000640
Wei Liao, Joel Voldman

Recent work in machine learning for healthcare has raised concerns about patient privacy and algorithmic fairness. Previous work has shown that self-reported race can be predicted from medical data that does not explicitly contain racial information. However, the extent of data identification is unknown, and we lack ways to develop models whose outcomes are minimally affected by such information. Here we systematically investigated the ability of time-series electronic health record data to predict patient static information. We found that not only the raw time-series data, but also learned representations from machine learning models, can be trained to predict a variety of static information with area under the receiver operating characteristic curve as high as 0.851 for biological sex, 0.869 for binarized age and 0.810 for self-reported race. Such high predictive performance can be extended to various comorbidity factors and exists even when the model was trained for different tasks, using different cohorts, using different model architectures and databases. Given the privacy and fairness concerns these findings pose, we develop a variational autoencoder-based approach that learns a structured latent space to disentangle patient-sensitive attributes from time-series data. Our work thoroughly investigates the ability of machine learning models to encode patient static information from time-series electronic health records and introduces a general approach to protect patient-sensitive information for downstream tasks.

PLOS Digital Health 3(10): e0000640. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493250/pdf/
Citations: 0
Derivation and validation of an algorithm to predict transitions from community to residential long-term care among persons with dementia-A retrospective cohort study.
Pub Date : 2024-10-18 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000441
Wenshan Li, Luke Turcotte, Amy T Hsu, Robert Talarico, Danial Qureshi, Colleen Webber, Steven Hawken, Peter Tanuseputro, Douglas G Manuel, Greg Huyer

Objectives: To develop and validate a model to predict time-to-LTC admissions among individuals with dementia.

Design: Population-based retrospective cohort study using health administrative data.

Setting and participants: Community-dwelling older adults (65+) in Ontario living with dementia and assessed with the Resident Assessment Instrument for Home Care (RAI-HC) between April 1, 2010 and March 31, 2017.

Methods: Individuals in the derivation cohort (n = 95,813; assessed before March 31, 2015) were followed for up to 360 days after the index RAI-HC assessment for admission into LTC. We used a multivariable Fine-Gray sub-distribution hazard model to predict the cumulative incidence of LTC entry while accounting for all-cause mortality as a competing risk. The model was validated in 34,038 older adults with dementia with an index RAI-HC assessment between April 1, 2015 and March 31, 2017.

Results: Within one year of a RAI-HC assessment, 35,513 (37.1%) individuals in the derivation cohort and 10,735 (31.5%) in the validation cohort entered LTC. Our algorithm was well-calibrated (Emax = 0.119, ICIavg = 0.057) and achieved a c-statistic of 0.707 (95% confidence interval: 0.703-0.712) in the validation cohort.

Conclusions and implications: We developed an algorithm to predict time to LTC entry among individuals living with dementia. This tool can inform care planning for individuals with dementia and their family caregivers.
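
The quantity the model targets, the cumulative incidence of LTC entry with death as a competing risk, can be illustrated without any covariates. The sketch below uses the nonparametric Aalen-Johansen estimator, which is a different and simpler technique than the multivariable Fine-Gray regression named in the Methods. The event codes (0 = censored, 1 = LTC entry, 2 = death) and the six toy follow-up times are hypothetical.

```python
from itertools import groupby


def cumulative_incidence(times, events, cause, horizon):
    """Aalen-Johansen estimate of P(event of type `cause` by `horizon`).

    events: 0 = censored; any other integer is a cause code.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    for t, group in groupby(data, key=lambda row: row[0]):
        if t > horizon:
            break
        codes = [code for _, code in group]
        d_cause = sum(1 for code in codes if code == cause)
        d_any = sum(1 for code in codes if code != 0)
        # Increment uses the event-free probability just before time t.
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        at_risk -= len(codes)
    return cif


# Hypothetical follow-up in days after the index assessment.
times = [30, 60, 90, 120, 200, 360]
events = [1, 2, 1, 0, 1, 0]  # 1 = LTC entry, 2 = death, 0 = censored
print(cumulative_incidence(times, events, cause=1, horizon=360))
print(cumulative_incidence(times, events, cause=2, horizon=360))
```

Treating deaths as censored and applying one minus Kaplan-Meier instead would overstate the probability of LTC entry, which is why a competing-risk approach is used.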

PLOS Digital Health 3(10): e0000441. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11488705/pdf/
Citations: 0
Using interpretable machine learning to predict bloodstream infection and antimicrobial resistance in patients admitted to ICU: Early alert predictors based on EHR data to guide antimicrobial stewardship.
Pub Date : 2024-10-16 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000641
Davide Ferrari, Pietro Arina, Jonathan Edgeworth, Vasa Curcin, Veronica Guidetti, Federica Mandreoli, Yanzhong Wang

Nosocomial infections and Antimicrobial Resistance (AMR) stand as formidable healthcare challenges on a global scale. To address these issues, various infection control protocols and personalized treatment strategies, guided by laboratory tests, aim to detect bloodstream infections (BSI) and assess the potential for AMR. In this study, we introduce a machine learning (ML) approach based on Multi-Objective Symbolic Regression (MOSR), an evolutionary approach that creates ML models in the form of readable mathematical equations in a multi-objective way to overcome the limitations of standard single-objective approaches. This method leverages readily available clinical data collected upon admission to intensive care units, with the goal of predicting the presence of BSI and AMR. We further assess its performance by comparing it to established ML algorithms using both naturally imbalanced real-world data and data that has been balanced through oversampling techniques. Our findings reveal that traditional ML models exhibit subpar performance across all training scenarios. In contrast, MOSR, specifically configured to minimize false negatives by also optimizing for the F1-Score, outperforms other ML algorithms and consistently delivers reliable results irrespective of the training set balance, with F1-Scores 0.22 and 0.28 higher than any other alternative. This research signifies a promising path forward in enhancing Antimicrobial Stewardship (AMS) strategies. Notably, the MOSR approach can be readily implemented on a large scale, offering a new ML tool to find solutions to these critical healthcare issues affected by limited data availability.
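
The role of the F1-Score on imbalanced data can be shown with a minimal sketch. It compares accuracy and F1 for three hypothetical classifiers on a toy set with 2 positives in 10 cases. The labels, predictions, and function names are illustrative only and are unrelated to the study's data or to the MOSR implementation.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced sample: 2 infected patients out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                     # misses every positive
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]      # one false negative
sensitive = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]     # no false negatives, one false alarm

for name, pred in [("always_negative", always_negative),
                   ("cautious", cautious),
                   ("sensitive", sensitive)]:
    print(name, accuracy(y_true, pred), round(f1_score(y_true, pred), 3))
```

Accuracy rewards the always-negative classifier, while F1 drops to zero when positives are missed, which is why an objective that includes F1 pushes a model away from false negatives.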

PLOS Digital Health 3(10): e0000641. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11482717/pdf/
Citations: 0
Exploring the relationship between telehealth utilization and treatment burden among patients with chronic conditions: A cross-sectional study in Ontario, Canada.
Pub Date : 2024-10-15 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000610
Farah Tahsin, Carolyn Steele Gray, Jay Shaw, Aviv Shachak

One in five Canadians lives with one or more chronic conditions. Patients with chronic conditions often experience a high treatment burden because of the work associated with managing care. Telehealth is considered a useful solution to reduce the treatment burden among patients with chronic conditions. However, telehealth can also increase the treatment burden by offloading responsibilities onto patients. This cross-sectional study, conducted in Ontario, Canada, examines the association between telehealth utilization and treatment burden among patients with chronic conditions. This study aimed to explore whether, and to what extent, telehealth use is associated with treatment burden among patients with chronic conditions. The secondary objective was to explore which sociodemographic variables are associated with patients' treatment burden. An online survey was administered to community-dwelling patients with one or more chronic conditions. The Treatment Burden Questionnaire (TBQ-15) was used to measure the patient's level of treatment burden, and a modified telehealth usage scale was developed and used to measure the frequency of telehealth use. Data were analyzed using descriptive statistics, correlations, analyses of variance, and hierarchical linear regression analysis. A total of 75 patients completed the survey. The participants' mean age was 64 (SD = 18.93) and 79% were female. The average reported treatment burden was 72.15 out of 150 (a higher score indicating a higher level of burden). When adjusted for demographic variables, a higher frequency of telehealth use was associated with experiencing a higher treatment burden, but the association was not statistically significant. Additionally, when adjusted for demographic variables, younger age and the presence of an unpaid caregiver were positively related to a high treatment burden score. This finding demonstrates that some patient populations are more at risk of experiencing high treatment burden in the context of telehealth use and hence may require extra support to utilize telehealth technologies. The study highlights the need for further research to explore how to minimize the treatment burden among individuals with higher healthcare needs.

Citations: 0
Towards women's digital health equity: A qualitative inquiry into attitude and adoption of reproductive mHealth services in Bangladesh.
Pub Date : 2024-10-15 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000637
M Jonayed, Maruf Hasan Rumi
Health equity in Bangladesh faces a large chasm across economic conditions, socio-cultural factors and geographic location despite the push for digitalization of the health sector. While some research has assessed the viability of digital health solutions in Bangladesh, the gender dynamics of digital healthcare have been absent. This study examined healthcare equity for women with a focus on reproductive health services delivered through mobile devices. This paper reports the findings of a qualitative study employing in-depth interviews with 26 women about their behavioral intention to use mHealth services for reproductive health and the underlying factors influencing this intention, using the Integrative Model of Planned Behavior (IMPB). A snowball sampling technique was used to recruit university-educated women, aged 21-31, from seven universities in Bangladesh, based on their familiarity with and exposure to mHealth services. The findings suggested that users of mHealth services find them more convenient and secure than visiting healthcare facilities, especially for trivial issues and inquiries regarding their reproductive health. Although promotion of such services lags behind that of traditional healthcare, the attitude toward reproductive health services in Bangladesh is generally favorable, resulting in increasing adoption and use. This is because such information-related mobile services (apps, websites, and social media) served as a first base of knowledge on reproductive health for many young girls and women in Bangladesh, who are generally shy about sharing or talking about their menstruation or personal health problems with family members, peers, or even health professionals due to socio-cultural factors and stigmatization. Conversely, urban-centric services, availability of experts, quality management, security of privacy, authenticity of the information, the digital divide, lack of campaign initiatives, lack of equipment and technology, lack of sex education, and outdated apps and websites were identified as obstacles that constrain the widespread use of reproductive mHealth services in Bangladesh. This study also concluded that promotion will be crucial in reforming conservative norms, taboos, and misconceptions about women's health, and recommended that such endeavors be initiated by policy makers, as there is a substantive need for a specific policy regulating the emerging digital health market in Bangladesh. Nevertheless, the women-only sample, small sample size, narrow focus on mHealth users and absence of perspectives from healthcare providers were among the shortcomings of this study, which could be addressed in future research. Further quantitative explorations are needed to determine the usage patterns of reproductive mHealth services and their effectiveness, which would identify implementation challenges in terms of customization and personalization in reproductive healthcare in a developing country like Bangladesh.
Citations: 0
Unsupervised clustering of longitudinal clinical measurements in electronic health records.
Pub Date : 2024-10-15 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000628
Arshiya Mariam, Hamed Javidi, Emily C Zabor, Ran Zhao, Tomas Radivoyevitch, Daniel M Rotroff

Longitudinal electronic health records (EHR) can be utilized to identify patterns of disease development and progression in real-world settings. Unsupervised temporal matching algorithms are being repurposed to EHR from signal processing- and protein-sequence alignment tasks where they have shown immense promise for gaining insight into disease. The robustness of these algorithms for classifying EHR clinical data remains to be determined. Timeseries compiled from clinical measurements, such as blood pressure, have far more irregularity in sampling and missingness than the data for which these algorithms were developed, necessitating a systematic evaluation of these methods. We applied 30 state-of-the-art unsupervised machine learning algorithms to 6,912 systematically generated simulated clinical datasets across five parameters. These algorithms included eight temporal matching algorithms with fourteen partitional and eight fuzzy clustering methods. Nemenyi tests were used to determine differences in accuracy using the Adjusted Rand Index (ARI). Dynamic time warping and its lower-bound variants had the highest accuracies across all cohorts (median ARI>0.70). All 30 methods were better at discriminating classes with differences in magnitude compared to differences in trajectory shapes. Missingness impacted accuracies only when classes were different by trajectory shape. The method with the highest ARI was then used to cluster a large pediatric metabolic syndrome (MetS) cohort (N = 43,426). We identified three unique childhood BMI patterns with high average cluster consensus (>70%). The algorithm identified a cluster with consistently high BMI which had the greatest risk of MetS, consistent with prior literature (OR = 4.87, 95% CI: 3.93-6.12). 
While these algorithms have been shown to have similar accuracies for regular timeseries, their accuracies in clinical applications vary substantially in discriminating differences in shape and especially with moderate to high missingness (>10%). This systematic assessment also shows that the most robust algorithms tested here can derive meaningful insights from longitudinal clinical data.
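
The two quantities named in this abstract, dynamic time warping (DTW) as the temporal matching step and the Adjusted Rand Index (ARI) as the accuracy measure, can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the function names, the absolute-difference cost, and the toy BMI-like series are assumptions made only for illustration.

```python
from collections import Counter
from math import comb, inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| as local cost."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def adjusted_rand_index(truth, pred):
    """ARI from the pair-counting contingency table of two labelings."""
    total = comb(len(truth), 2)
    if total == 0:
        return 1.0  # fewer than two items: the labelings trivially agree
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # both labelings put everything in one cluster (or all singletons)
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical, irregularly sampled series: two "high" and two "low" trajectories.
series = [
    [30, 31, 33, 34],
    [30, 30, 31, 33, 34, 34],
    [18, 19, 19, 20],
    [18, 18, 19, 20],
]
truth = [0, 0, 1, 1]
medoids = [series[0], series[2]]  # assumed reference series, one per class
pred = [min(range(len(medoids)), key=lambda k: dtw_distance(s, medoids[k])) for s in series]
print(pred, adjusted_rand_index(truth, pred))
```

Because DTW allows one sample to align with several samples of the other series, the two "high" series above have a distance of zero despite their different lengths, which is the property that makes the measure attractive for irregularly sampled clinical data.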

Citations: 0
Use of large language models as a scalable approach to understanding public health discourse.
Pub Date : 2024-10-14 eCollection Date: 2024-10-01 DOI: 10.1371/journal.pdig.0000631
Laura Espinosa, Marcel Salathé

Online public health discourse is becoming more and more important in shaping public health dynamics. Large Language Models (LLMs) offer a scalable solution for analysing the vast amounts of unstructured text found on online platforms. Here, we explore the effectiveness of LLMs, including GPT models and open-source alternatives, for extracting public stances towards vaccination from social media posts. Using an expert-annotated dataset of social media posts related to vaccination, we applied various LLMs and a rule-based sentiment analysis tool to classify the stance towards vaccination. We assessed the accuracy of these methods through comparisons with expert annotations and annotations obtained through crowdsourcing. Our results demonstrate that few-shot prompting of best-in-class LLMs is the best-performing method, and that all alternatives carry significant risks of substantial misclassification. The study highlights the potential of LLMs as a scalable tool for public health professionals to quickly gauge public opinion on health policies and interventions, offering an efficient alternative to traditional data analysis methods. With the continuous advancement in LLM development, the integration of these models into public health surveillance systems could substantially improve our ability to monitor and respond to changing public health attitudes.
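
As a rough illustration of the few-shot prompting and accuracy comparison described above, the sketch below assembles a prompt from a handful of labeled posts and scores predictions against expert annotations. It is not the authors' code: the label set, the prompt wording, the example posts, and both function names are hypothetical, and no model is called.

```python
def build_few_shot_prompt(examples, post, labels=("positive", "neutral", "negative")):
    """Assemble a few-shot stance-classification prompt (labels are an assumed set)."""
    lines = [
        "Classify the stance towards vaccination expressed in each post.",
        "Answer with one of: " + ", ".join(labels) + ".",
        "",
    ]
    for text, label in examples:
        lines += [f"Post: {text}", f"Stance: {label}", ""]
    # The unlabeled post goes last, leaving the stance for the model to complete.
    lines += [f"Post: {post}", "Stance:"]
    return "\n".join(lines)


def accuracy(predicted, gold):
    """Share of predictions that match the expert annotation."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical labeled examples; a real study would draw these from the annotated dataset.
examples = [
    ("Got my booster today, feeling grateful.", "positive"),
    ("The clinic opens at nine on weekdays.", "neutral"),
    ("I will never trust these shots.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Vaccines saved my grandmother's life.")
print(prompt)
```

The resulting string would be sent to whichever model is under evaluation, and the returned labels compared with the expert labels through `accuracy`.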

Citations: 0