International Journal of Medical Informatics: Latest Articles (Page 9)

Probabilistic prediction of arrivals and hospitalizations in emergency departments in Île-de-France

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-04 DOI: 10.1016/j.ijmedinf.2024.105728

Herbert Susmann , Antoine Chambaz , Julie Josse , Philippe Aegerter , Mathias Wargon , Emmanuel Bacry

Background

Forecasts of future demand are foundational for effective resource allocation in emergency departments (EDs). Because ED demand is inherently variable, forecasts should characterize the range of possible future demand. However, extant research focuses primarily on producing point forecasts using a wide variety of prediction algorithms. In this study, our objective is to generate point and interval predictions that accurately characterize the variability in ED demand using ensemble methods that combine predictions from multiple base algorithms based on their empirical performance.

Methods

The data consisted of daily arrivals and subsequent hospitalizations at 72 emergency departments in Île-de-France from 2014 to 2018. Additional explanatory variables were collected, including public and school holidays, meteorological variables, and public health trends. One-day-ahead point and 80% interval predictions of arrivals and hospitalizations were produced by predicting the 10%, 50%, and 90% quantiles of the forecast distribution. Quantile prediction algorithms included methods such as ARIMAX, variations of random forests, and generalized additive models. Ensemble predictions were then formed using Exponentially Weighted Averaging, Bernstein Online Aggregation, and Super Learning. Prediction intervals were post-processed using Adaptive Conformal Inference techniques. Point predictions were evaluated by their Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), and 80% interval predictions by their empirical coverage and mean interval width.
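The post-processing step described above can be illustrated with a minimal sketch of adaptive conformal adjustment applied to a pair of predicted quantiles. This is a simplified illustration, not the authors' implementation: the function names, the step size `gamma`, and the clipping of the adapted level to [0, 1] are assumptions made here for the example.

```python
import math


def empirical_quantile(values, level):
    """Smallest past value whose empirical CDF reaches `level` (0.0 if no history)."""
    if not values:
        return 0.0
    level = min(max(level, 0.0), 1.0)  # simplification: clip instead of infinite/empty intervals
    ordered = sorted(values)
    index = max(math.ceil(level * len(ordered)) - 1, 0)
    return ordered[index]


def adaptive_conformal_intervals(lower, upper, observed, target_alpha=0.2, gamma=0.05):
    """Widen or shrink predicted [lower, upper] quantile intervals online.

    lower/upper: one-step-ahead 10% and 90% quantile predictions.
    observed: realized values, revealed after each prediction.
    target_alpha: desired miscoverage (0.2 for an 80% interval).
    gamma: hypothetical step size for the adaptive update.
    """
    alpha_t = target_alpha
    scores = []      # past conformity scores (signed distance outside the raw interval)
    intervals = []
    for q_lo, q_hi, y in zip(lower, upper, observed):
        margin = empirical_quantile(scores, 1.0 - alpha_t)
        lo, hi = q_lo - margin, q_hi + margin
        intervals.append((lo, hi))
        miss = 0.0 if lo <= y <= hi else 1.0
        # After a miss, alpha_t decreases, so later intervals widen; after a hit it increases.
        alpha_t += gamma * (target_alpha - miss)
        scores.append(max(q_lo - y, y - q_hi))
    return intervals
```

Empirical coverage and mean interval width, the two interval metrics named in the abstract, can then be computed directly from the returned list of `(lo, hi)` pairs.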

Results

For point forecasts, ensemble methods achieved lower average MAE and MAPE than any of the base algorithms. All of the base algorithms and ensemble methods yielded prediction intervals with near-optimal empirical coverage after conformalization. For hospitalizations, the shortest mean interval widths were achieved by the ensemble methods.

Conclusions

Ensemble methods yield joint point and interval predictions that adapt to individual EDs and achieve better performance than individual algorithms. Conformal inference techniques improve the performance of the prediction intervals.


Volume 195, Article 105728

Citations: 0

Construction and evaluation of prediction model for postoperative re-fractures in elderly patients with hip fractures

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-02 DOI: 10.1016/j.ijmedinf.2024.105738

Jingjing Wu , Qingqing Zeng , Sijie Gui , Zhuolan Li , Wanyu Miao , Mi Zeng , Manyi Wang , Li Hu , Guqing Zeng

Objective

The aim of this study was to construct a postoperative re-fracture prediction model for elderly hip fracture patients using an automated machine learning algorithm, providing a basis for early identification of patients at high risk of re-fracture.

Methods

Clinical data were collected and subjected to univariate and multivariate analyses to determine the independent risk factors for postoperative re-fracture in elderly patients with hip fracture. The collected data were divided into training and validation sets in a ratio of 7:3. AutoGluon was applied to construct LightGBMXT, LightGBM, RandomForestGini, RandomForestEntr, CatBoost, NeuralNetFastAI, XGBoost, NeuralNetTorch, LightGBMLarge and WeightedEnsemble_L2 prediction models, and the constructed models were evaluated using evaluation indicators. The models were externally validated and the model with the best prediction performance was selected.

Results

The incidence of postoperative re-fracture was about 11.7%. Age, comorbid diabetes mellitus, comorbid osteoporosis, rehabilitation exercise status, and preoperative total protein level were identified as independent risk factors. The top three models by AUC in the training set were WeightedEnsemble_L2 (0.9671), XGBoost (0.9636), and LightGBM (0.9626); WeightedEnsemble_L2 (0.9759) was also best in the external validation. Based on the AUC and other evaluation indicators, WeightedEnsemble_L2 was considered the model with the best prediction performance.
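For readers comparing the AUC values above, the following sketch shows what the metric measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc_score(labels, scores):
    """Area under the ROC curve computed as a pairwise ranking statistic.

    labels: iterable of 0/1 outcomes (1 = event, e.g. re-fracture).
    scores: predicted risk scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example with made-up scores: three of the four positive/negative pairs are ordered correctly.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

The quadratic pairwise loop is chosen for readability; library implementations typically use a sort-based computation that gives the same value.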

Conclusion

The constructed model is highly generalizable and applicable, and can be used as an effective tool for healthcare professionals to assess and manage patients’ risk of re-fracture.


Volume 195, Article 105738

Citations: 0

Development of machine learning-based models to predict congenital heart disease: A matched case-control study

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-02 DOI: 10.1016/j.ijmedinf.2024.105741

Shutong Zhang , Chenxi Kang , Jing Cui , Haodan Xue , Shanshan Zhao , Yukui Chen , Haixia Lu , Lu Ye , Duolao Wang , Fangyao Chen , Yaling Zhao , Leilei Pei , Pengfei Qu

Background

The current congenital heart disease (CHD) prediction tools lack adequate interpretability and convenience, hindering the development of personalized CHD management strategies. We developed a machine learning-based risk stratification model for CHD prediction.

Methods

This study used data from 1,759 participants in a case-control study of CHD conducted across six birth defects surveillance hospitals in Xi’an, Shaanxi Province, Northwest China, from January 2014 to December 2016. The data were partitioned into training and testing datasets in a ratio of 7:3. Predictors were selected from a total of 47 input variables through the Least Absolute Shrinkage and Selection Operator (LASSO). Five machine learning algorithms were used to build the CHD risk prediction models. Model performance was assessed with a range of metrics, including the area under the receiver operating characteristic curve (AUROC), F1 score, and Brier score. Permutation feature importance was employed to interpret the prediction model. The best-performing model was used to derive the risk scores.
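Two of the tools named above, the Brier score and permutation feature importance, can be sketched in a few lines. This is an illustrative sketch only: the function names, the use of the Brier score as the loss inside the importance calculation, the number of repeats, and the assumption that each row is a plain Python list are choices made here, not details reported by the study.

```python
import random


def brier_score(labels, probabilities):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


def permutation_importance(predict, rows, labels, column, repeats=20, seed=0):
    """Average increase in Brier score when one feature column is shuffled.

    predict: callable mapping a feature row (list) to a predicted probability.
    rows: list of feature rows, each a list of values.
    column: index of the feature to permute.
    A larger increase means the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = brier_score(labels, [predict(row) for row in rows])
    increases = []
    for _ in range(repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in place of the original one.
        permuted = [row[:column] + [value] + row[column + 1:]
                    for row, value in zip(rows, shuffled)]
        permuted_loss = brier_score(labels, [predict(row) for row in permuted])
        increases.append(permuted_loss - baseline)
    return sum(increases) / repeats
```

A feature that the model never reads has an importance of exactly zero under this definition, because shuffling it leaves every prediction unchanged.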

Results

The eXtreme Gradient Boosting (XGB) model demonstrated superior performance among CHD prediction models, achieving an AUROC of 0.772 (95% CI 0.728, 0.817) in the testing dataset and 0.738 (95% CI 0.699, 0.775) in the external validation dataset. The pivotal predictors (top 3) identified by the model included living in rural areas, the low wealth index, and folic acid supplements (<90 days). The resultant risk score exhibited robust calibration capabilities. Utilizing the risk scores, participants were stratified into low, moderate, and high-risk categories, signifying substantial variations in CHD risk.

Conclusion

This study underscores the feasibility and efficacy of employing a machine learning-based approach for CHD prediction. The risk scores exhibited potential in identifying pregnant women at high risk for fetal CHD, offering valuable insights for guiding primary prevention and CHD management.


Volume 195, Article 105741

Citations: 0

Prediction of mortality in hemodialysis patients based on autoencoders

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-01 DOI: 10.1016/j.ijmedinf.2024.105744

Shuzhi Su , Jisheng Gao , Jingjing Dong , Qi Guo , Hualin Ma , Shaodong Luan , Xuejia Zheng , Huihui Tao , Lingling Zhou , Yong Dai

Background

Patients with end-stage renal disease (ESRD) undergoing hemodialysis (HD) exhibit a high mortality risk, particularly at the onset of treatment. Conventional risk assessment models, dependent on extensive temporal data accumulation, frequently encounter issues of data incompleteness and lengthy collection periods.

Objective

This study addresses the imbalance in short-term HD data and the issue of missing data features, achieving a robust assessment of mortality risk for HD patients over the subsequent 30 to 450 days.

Methods

An autoencoder-based mortality prediction model for HD patients is proposed. Leveraging the manifold structure of the non-missing features and the intrinsic relationship between the features in the high-dimensional data space, the model infers the values of the missing features. Noise and redundant information typically distort the manifold structure, impacting the accuracy of inferences about missing features. Consequently, we generate feature dropping masks to simulate the missing data distribution in the deep learning framework and design an autoencoder, forming an adaptive feature extraction module. The module utilizes readily available short-term data for unsupervised learning, enabling the encoder to reconstruct missing features and derive latent representations. Finally, a classifier based on the latent representations achieves the mortality prediction.
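The feature-dropping idea described above can be illustrated with a small sketch of the masking step and a reconstruction loss restricted to features that were actually observed. The abstract does not give implementation details, so everything here is an assumption for illustration: the function names, the independent per-feature drop rate, the zero fill value, and the mean-squared-error loss. The autoencoder itself is not shown.

```python
import random


def make_drop_mask(n_features, drop_rate, rng):
    """Return a 0/1 mask: 1 keeps a feature, 0 simulates it being missing."""
    return [0 if rng.random() < drop_rate else 1 for _ in range(n_features)]


def apply_mask(features, mask, fill_value=0.0):
    """Replace dropped features with a placeholder before feeding the encoder."""
    return [x if keep else fill_value for x, keep in zip(features, mask)]


def reconstruction_loss(original, reconstructed, observed_mask):
    """Mean squared error over features that were truly observed in the record.

    observed_mask marks real measurements (1) versus genuinely missing values (0),
    so the model is never penalized against values that were not recorded.
    """
    errors = [(o - r) ** 2
              for o, r, seen in zip(original, reconstructed, observed_mask) if seen]
    return sum(errors) / len(errors) if errors else 0.0
```

In a training loop under these assumptions, each record would be corrupted with a fresh mask from `make_drop_mask`, passed through the encoder and decoder, and scored with `reconstruction_loss` against the uncorrupted record, so that the network learns to fill in features it cannot see.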

Results

Over a 30-day observation window, the model demonstrated superior mortality prediction performance compared to other models across all prediction windows. Feature importance analysis showed that creatinine and age are consistently the most critical features across all prediction windows. Glucose (fasting) and platelet count also remain significant, with their correlation with mortality risk increasing over time. Serum albumin, international standard ratio, and phosphate are particularly important for short-term predictions, while conjugated bilirubin and prothrombin time gain prominence in mid- and long-term predictions.

Conclusion

The proposed model leverages short-term HD data to provide precise mortality risk evaluations in HD patients, with particular efficacy in the short term. Its application holds considerable value for clinical decision-making and risk management in this patient population.


Volume 195, Article 105744

Citations: 0

Association of clerical burden and EHR frustration with burnout and career intentions among physician faculty in an urban academic health system

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-01 DOI: 10.1016/j.ijmedinf.2024.105740

Jonathan A. Ripp , Robert H. Pietrzak , Eleonore de Guillebon , Lauren A. Peccoralo

Background and objectives

To examine changes in clerical burden, including daily clerical time, daily after-hours Electronic Health Record (EHR) time, and EHR frustration, between 2018 and 2022 among physician faculty, and to identify sociodemographic and occupational correlates of clerical burden with burnout and intent to leave one’s job (ILJ).

Methods

An institution-wide survey was sent to all physician faculty at an eight-hospital health system in New York City between July and September 2022. Clerical time, after-hours EHR time, practice efforts to unload clerical burden, and EHR frustration were assessed using ordinal-scale questions. Burnout was assessed using the Maslach Burnout Inventory-2. Demographic characteristics and ILJ were also assessed. Multivariable logistic regression analyses were conducted to determine associations between clerical burden and burnout and ILJ.

Results

Daily clerical and after-hours EHR time increased in 2022 compared with 2018–2019 data. Medicine- vs. hospital-based department and hours worked per week were associated with greater clerical and after-hours EHR time, and female gender with greater after-hours EHR time. After adjusting for demographic and occupational characteristics, greater clerical time and EHR frustration were associated with greater likelihood of burnout. Endorsement of practice efforts to unload clerical burden was associated with lower odds of burnout. EHR frustration was associated with greater likelihood of ILJ. Junior faculty rank was linked to both burnout and ILJ.

Conclusions

Clerical burden may be increasing among physician faculty and may be linked to greater odds of burnout and intention to leave. Results underscore the importance of unloading this burden to maintain a healthy workforce and avoid strain on our healthcare system.


Volume 195, Article 105740

Citations: 0

Application of machine learning for delirium prediction and analysis of associated factors in hospitalized COVID-19 patients: A comparative study using the Korean Multidisciplinary cohort for delirium prevention (KoMCoDe)

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-01 DOI: 10.1016/j.ijmedinf.2024.105747

Hye Yoon Park , Hyoju Sohn , Arum Hong , Soo Wan Han , Yuna Jang , EKyong Yoon , Myeongju Kim , Hye Youn Park

Background

The incidence of delirium in hospitalized coronavirus disease 2019 (COVID-19) patients is linked to adverse health outcomes. Predicting the occurrence and risk factors of delirium is key to preventing its sudden onset.

Aims

To explore the factors associated with delirium in hospitalized COVID-19 patients and to compare the performance of various machine learning (ML) techniques for future use in predicting delirium.

Methods

We analyzed a dataset of 1,031 cases from two healthcare centers, which included 178 variables such as demographics, clinical data, and medication information. The ML techniques used in this study were extreme gradient boosting (XGB), light gradient boosting machine (LGBM), logistic regression (LR), random forest (RF), and support vector machine (SVM).

Results

The RF model emerged as the most effective for predicting delirium, achieving an area under the curve (AUC) of 0.923. It showed a sensitivity of 0.639, accuracy of 0.900, specificity of 0.934, positive predictive value (PPV) of 0.561, negative predictive value (NPV) of 0.952, and an F1 score of 0.597. The RF model identified key variables related to delirium, including medication type (antipsychotic, sedative, opioid), duration of hospital stay, remdesivir usage, and patient age. The reliability of the model was affirmed through calibration plots and Brier score evaluations.
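The reported metrics are all functions of a single confusion matrix, so they can be cross-checked against one another. The following is a minimal sketch, not the authors' code: the function names and the example counts are hypothetical, and only the reported PPV (0.561), sensitivity (0.639), and F1 score (0.597) come from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the usual binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall: share of delirium cases detected
    specificity = tn / (tn + fp)          # share of non-delirium cases correctly ruled out
    ppv = tp / (tp + fp)                  # precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


def f1_from(ppv, sensitivity):
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts, for illustration only (not taken from the study).
example = classification_metrics(tp=8, fp=2, tn=85, fn=5)

# Consistency check using the values reported in the abstract.
reported_f1 = round(f1_from(0.561, 0.639), 3)
```

With the reported PPV and sensitivity, the harmonic mean rounds to the reported F1 score of 0.597, so the three figures are mutually consistent.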

Conclusions

This research developed and validated an RF-based ML model for predicting delirium in hospitalized COVID-19 patients. The model demonstrated higher accuracy and reliability than the other ML methods compared and could serve as a valuable tool for anticipating and managing delirium in COVID-19 patients, with the potential to improve patient outcomes.


Citations: 0

Assessing AI Simplification of Medical Texts: Readability and Content Fidelity

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-01 DOI: 10.1016/j.ijmedinf.2024.105743

Bryce Picton , Saman Andalib , Aidin Spina , Brandon Camp , Sean S. Solomon , Jason Liang , Patrick M. Chen , Jefferson W. Chen , Frank P. Hsu , Michael Y. Oh

Introduction

The escalating complexity of medical literature necessitates tools to enhance readability for patients. This study aimed to evaluate the efficacy of ChatGPT-4 in simplifying neurology and neurosurgical abstracts and patient education materials (PEMs) while assessing content preservation using Latent Semantic Analysis (LSA).

Methods

A total of 100 abstracts (25 each from Neurosurgery, Journal of Neurosurgery, Lancet Neurology, and JAMA Neurology) and 340 PEMs (66 from the American Association of Neurological Surgeons, 274 from the American Academy of Neurology) were transformed by a GPT-4.0 prompt requesting a 5th grade reading level. Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FKRE) scores were computed before and after transformation. Content fidelity was validated via LSA (ranging 0–1, with 1 meaning identical topics) and by expert assessment (0–1) for a subset (n = 40). The Pearson correlation coefficient was used to compare the two assessments.
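For reference, both readability scores are simple functions of word, sentence, and syllable counts. The sketch below uses the standard published Flesch formulas; the function names are illustrative, and the counts are passed in directly because the abstract does not say which tokenizer or syllable counter the authors used.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: higher means harder (roughly a US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher means easier (roughly a 0-100 scale)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Hypothetical text statistics: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
ease = flesch_reading_ease(100, 5, 150)
```

Shorter sentences and fewer syllables per word lower the grade level and raise the reading ease, which is why the two scores move in opposite directions after simplification.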

Results

FKGL decreased from 12th to 5th grade for abstracts and 13th to 5th for PEMs (p < 0.001). FKRE scores showed similar improvement (p < 0.001). LSA confirmed high content similarity for abstracts (mean cosine similarity 0.746) and PEMs (mean 0.953). Expert assessment indicated a mean topic similarity of 0.775 for abstracts and 0.715 for PEMs. The Pearson coefficient between LSA and expert assessment of textual similarity was 0.598 for abstracts and −0.167 for PEMs. Segmented analysis of similarity correlations revealed a correlation of 0.48 (p = 0.02) below 450 words and a −0.20 (p = 0.43) correlation above 450 words.
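The two similarity statistics in these results can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's pipeline: LSA similarity is taken here to be the cosine between two document vectors in the reduced topic space, the vectors and scores below are hypothetical, and the function names are not from the source.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two document vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical per-document scores: automated similarity vs. expert rating.
lsa_scores = [0.70, 0.80, 0.90, 0.95]
expert_scores = [0.60, 0.75, 0.80, 0.90]
agreement = pearson(lsa_scores, expert_scores)
```

A coefficient near 1 would mean the automated score ranks documents the way the experts do; a value near zero or below, as reported for the PEMs, means it does not.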

Conclusion

GPT-4.0 markedly improved the readability of medical texts while largely maintaining content integrity, as substantiated by LSA and expert evaluations. LSA emerged as a reliable tool for assessing content fidelity within moderate-length texts, but its utility diminished for longer documents, where it overestimated similarity. These findings support the potential of AI in combating low health literacy; however, the similarity scores indicate that expert validation is crucial. Future research must strive to improve transformation precision and develop validation methodologies.


Citations: 0

Universal representations in cardiovascular ECG assessment: A self-supervised learning approach

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-12-01 DOI: 10.1016/j.ijmedinf.2024.105742

Zhi-Yong Liu , Ching-Heng Lin , Yu-Chun Hsu , Jung-Sheng Chen , Po-Cheng Chang , Ming-Shien Wen , Chang-Fu Kuo

Background

The 12-lead electrocardiogram (ECG) is an established modality for cardiovascular assessment. While deep learning algorithms have shown promising results for analyzing ECG data, the limited availability of labeled datasets hinders broader applications. Self-supervised learning can learn meaningful representations from unlabeled data and transfer that knowledge to downstream tasks. This study describes the development and validation of a self-supervised learning methodology tailored to produce universal ECG representations from longitudinally collected ECG data, applicable across a spectrum of cardiovascular assessments.

Methods

We introduced a pre-trained model that utilizes contrastive self-supervised learning to learn universal ECG representations from 4,932,573 ECG tracings from 1,684,298 adult patients on 7 campuses of Chang Gung Memorial Hospital. We extensively evaluated the proposed model using an internal dataset collected from diverse healthcare establishments and an external public dataset encompassing varied cardiovascular conditions and sample sizes.
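The abstract does not state which contrastive objective was used. As a generic illustration of the idea, the sketch below implements an InfoNCE-style loss in plain Python: each anchor embedding should be most similar to its own positive (for example, another tracing or view from the same patient), while the other positives in the batch act as negatives. The function names, the pairing scheme, and the temperature value are assumptions, not details from the study.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch; positives[i] is the match for anchors[i]."""
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        # Similarity of this anchor to every candidate in the batch.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching positive as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy 2-D embeddings: correctly paired views vs. swapped pairs.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls matching pairs together and pushes non-matching pairs apart, which is what lets the encoder learn representations without diagnostic labels.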

Results

The pre-trained model performed equivalently to conventionally trained models, which rely solely on supervised learning, on both internal and external datasets in assessing atrial fibrillation, atrial flutter, premature rhythm abnormalities, first-degree atrioventricular block, and myocardial infarction. When applied to small sample sizes, the learned ECG representations enhanced the classification models, resulting in an improvement of up to 0.3 in the area under the receiver operating characteristic curve (AUROC).

Conclusions

The ECG representations learned from longitudinal ECG data are highly effective, particularly with small sample sizes, and further enhance the learning process and boost robustness.


Citations: 0

Development of a code system for allergens and its integration into the HL7 FHIR AllergyIntolerance resource

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-11-30 DOI: 10.1016/j.ijmedinf.2024.105739

Yoshimasa Kawazoe , Satomi Nagashima , Shinichiroh Yokota , Kazuhiko Ohe

Background

Allergy code systems are essential for safety and medical information interoperability. However, current terminology systems lack allergens unique to Japan.

Objective

This study established a code system encompassing Japanese food and non-food/non-medication allergens (JFAGY), and developed a meta-code system for integration with existing drug code systems. The practicality and limitations of the JFAGY were assessed by profiling the HL7 FHIR AllergyIntolerance resource.

Methods

Allergen terms were selected based on the Standard Commodity Classification of Japan. Additional terms were extracted from clinical guidelines and public documents. For non-food, non-medication allergens, terms from the clinical guidelines were manually compiled to conform to a classification hierarchy. To validate the coverage of the developed food allergen code system, we extracted 823 unique food allergens, totaling 12,027 entries, from two years of electronic health records (EHRs) and performed manual mapping to the code system.

Results

In total, 1,123 food and 607 non-food/non-medication allergen terms were included. The three-digit meta-code system comprises an identifier for coding systems, code length, and allergen categories. The codes allowed the determination of hierarchical relationships between any two terms. The Japanese allergy intolerance value set was developed and bound to the allergy intolerance code. Of the food allergens extracted from EHRs, 62.9% corresponded to unique codes, 6.1% to multiple codes, and 31.0% were unmapped, accounting for 91.5%, 1.9%, and 6.6% of entries, respectively.
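The gap between the term-level figures (31.0% unmapped) and the entry-level figures (6.6% unmapped) arises because entry-level coverage weights each allergen by how often it appears in the records. A minimal sketch of that calculation follows; the allergen names, codes, and function name are hypothetical and not drawn from the JFAGY.

```python
from collections import Counter


def coverage(entries, mapping):
    """Return (term-level, entry-level) shares of unique/multiple/unmapped allergens.

    entries: allergen strings as recorded, one per EHR entry.
    mapping: dict from allergen string to the list of codes it was mapped to.
    """
    counts = Counter(entries)

    def status(term):
        n_codes = len(mapping.get(term, []))
        if n_codes == 1:
            return "unique"
        return "multiple" if n_codes > 1 else "unmapped"

    term_level = Counter(status(term) for term in counts)
    entry_level = Counter()
    for term, n in counts.items():
        entry_level[status(term)] += n  # weight each term by its frequency

    n_terms = len(counts)
    n_entries = sum(counts.values())
    return (
        {k: v / n_terms for k, v in term_level.items()},
        {k: v / n_entries for k, v in entry_level.items()},
    )


# Hypothetical records: one common mapped term, two rare ones.
records = ["egg"] * 8 + ["milk"] + ["rare fruit"]
codes = {"egg": ["A1"], "milk": ["B1", "B2"]}
by_term, by_entry = coverage(records, codes)
```

In this toy data, a third of the distinct terms are unmapped but they account for only a tenth of the entries, the same pattern the study reports: unmapped allergens tend to be the rarely recorded ones.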

Conclusions

The JFAGY encompasses Japanese-specific food and non-food/non-medication allergens, enabling hierarchy determination between two terms, and playing a critical role in medical safety. When utilizing the JFAGY with the FHIR allergy intolerance resource, an FHIR extension must be included to denote a denied allergy.
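To make the last point concrete, the sketch below builds a bare-bones FHIR AllergyIntolerance resource as a Python dictionary. The element names (`resourceType`, `category`, `code`, `patient`, `extension`) follow the FHIR resource definition, but the code system URI, the extension URL, the allergen code, and the function name are placeholders: the abstract does not give the real identifiers. Status elements and profile constraints are omitted for brevity.

```python
import json

# Placeholder URIs: the real JFAGY system URI and extension URL are not given in the abstract.
JFAGY_SYSTEM = "http://example.org/fhir/CodeSystem/jfagy"
DENIED_EXTENSION = "http://example.org/fhir/StructureDefinition/allergy-denied"


def build_allergy_intolerance(patient_id, code, display, denied=False):
    """Build a minimal AllergyIntolerance resource for a food allergen."""
    resource = {
        "resourceType": "AllergyIntolerance",
        "category": ["food"],
        "code": {
            "coding": [{"system": JFAGY_SYSTEM, "code": code, "display": display}]
        },
        "patient": {"reference": f"Patient/{patient_id}"},
    }
    if denied:
        # Hypothetical extension flagging that the patient denies this allergy.
        resource["extension"] = [{"url": DENIED_EXTENSION, "valueBoolean": True}]
    return resource


# "X01" is an invented code used only to show the shape of the resource.
recorded = build_allergy_intolerance("123", "X01", "example allergen")
denied = build_allergy_intolerance("123", "X01", "example allergen", denied=True)
as_json = json.dumps(denied, indent=2)
```

The point of the extension is that the base resource describes an allergy a patient has; recording that a specific allergen was asked about and denied needs an extra, explicitly defined element.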


Citations: 0

A randomized controlled trial on evaluating clinician-supervised generative AI for decision support

IF 3.7 Zone 2 (Medicine) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics

Pub Date : 2024-11-29 DOI: 10.1016/j.ijmedinf.2024.105701

Rayan Ebnali Harari , Abdullah Altaweel , Tareq Ahram , Madeleine Keehner , Hamid Shokoohi

Background

The integration of generative artificial intelligence (AI) as clinical decision support systems (CDSS) into telemedicine presents a significant opportunity to enhance clinical outcomes, yet its application remains underexplored.

Objective

This study investigates the efficacy of one of the most common generative AI tools, ChatGPT, for providing clinical guidance during cardiac arrest scenarios.

Methods

We examined the performance, cognitive load, and trust associated with traditional methods (paper guide), autonomous ChatGPT, and clinician-supervised ChatGPT, where a clinician supervised the AI recommendations. Fifty-four subjects without medical backgrounds participated in randomized controlled trials, each assigned to one of three intervention groups: paper guide, ChatGPT, or supervised ChatGPT. Participants completed a standardized CPR scenario using an Augmented Reality (AR) headset, and performance, physiological, and self-reported metrics were recorded.

Main Findings

Results indicate that the Supervised-ChatGPT group showed significantly higher decision accuracy compared to the paper guide and ChatGPT groups, although the scenario completion time was longer. Physiological data showed a reduced LF/HF ratio in the Supervised-ChatGPT group, suggesting potentially lower cognitive load. Trust in AI was also highest in the supervised condition. In one instance, ChatGPT suggested a risky option, highlighting the need for clinician supervision.
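The LF/HF ratio is a heart rate variability measure: the power in the low-frequency band of the heart-rate spectrum divided by the power in the high-frequency band. The sketch below assumes the conventional band limits (0.04 to 0.15 Hz and 0.15 to 0.40 Hz) and an evenly spaced power spectral density that has already been computed; the abstract does not describe how the authors derived the ratio, and the names here are illustrative.

```python
# Conventional HRV band limits in Hz (an assumption; not stated in the abstract).
LF_BAND = (0.04, 0.15)
HF_BAND = (0.15, 0.40)


def band_power(freqs, psd, band):
    """Sum spectral power over bins with lo <= f < hi (assumes evenly spaced bins)."""
    lo, hi = band
    return sum(power for f, power in zip(freqs, psd) if lo <= f < hi)


def lf_hf_ratio(freqs, psd):
    """Ratio of low-frequency to high-frequency power in a heart-rate spectrum."""
    lf = band_power(freqs, psd, LF_BAND)
    hf = band_power(freqs, psd, HF_BAND)
    if hf == 0:
        raise ValueError("no power in the HF band")
    return lf / hf


# Hypothetical flat spectrum sampled every 0.01 Hz from 0 to 0.5 Hz.
frequencies = [i / 100 for i in range(51)]
flat_psd = [1.0] * len(frequencies)
ratio = lf_hf_ratio(frequencies, flat_psd)
```

Because the bins are evenly spaced, the bin width cancels in the ratio, so summing bin powers is enough here; unevenly spaced spectra would need a proper numerical integration.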

Conclusion

Our findings highlight the potential of supervised generative AI to enhance decision-making accuracy and user trust in emergency healthcare settings, despite trade-offs with response time. The study underscores the importance of clinician oversight and the need for further refinement of AI systems to improve safety. Future research should explore strategies to optimize AI supervision and assess the implementation of these systems in real-world clinical settings.


Citations: 0