{"title":"Accurate prediction of all-cause mortality in patients with metabolic dysfunction-associated steatotic liver disease using electronic health records","authors":"","doi":"10.1016/j.aohep.2024.101528","DOIUrl":null,"url":null,"abstract":"<div><h3>Introduction and Objectives</h3><p>Despite the huge clinical burden of MASLD, validated tools for early risk stratification are lacking, and heterogeneous disease expression and a highly variable rate of progression to clinical outcomes result in prognostic uncertainty. We aimed to investigate longitudinal electronic health record-based outcome prediction in MASLD using a state-of-the-art machine learning model.</p></div><div><h3>Patients and Methods</h3><p><em>n</em> = 940 patients with histologically-defined MASLD were used to develop a deep-learning model for all-cause mortality prediction. Patient timelines, spanning 12 years, were fully-annotated with demographic/clinical characteristics, ICD-9 and -10 codes, blood test results, prescribing data, and secondary care activity. A Transformer neural network (TNN) was trained to output concomitant probabilities of 12-, 24-, and 36-month all-cause mortality. In-sample performance was assessed using 5-fold cross-validation. Out-of-sample performance was assessed in an independent set of <em>n</em> = 528 MASLD patients.</p></div><div><h3>Results</h3><p>In-sample model performance achieved AUROC curve 0.74–0.90 (95 % CI: 0.72–0.94), sensitivity 64 %-82 %, specificity 75 %–92 % and Positive Predictive Value (PPV) 94 %-98 %. Out-of-sample model validation had AUROC 0.70–0.86 (95 % CI: 0.67–0.90), sensitivity 69 %–70 %, specificity 96 %–97 % and PPV 75 %–77 %. Key predictive factors, identified using coefficients of determination, were age, presence of type 2 diabetes, and history of hospital admissions with length of stay >14 days.</p></div><div><h3>Conclusions</h3><p>A TNN, applied to routinely-collected longitudinal electronic health records, achieved good performance in prediction of 12-, 24-, and 36-month all-cause mortality in patients with MASLD. Extrapolation of our technique to population-level data will enable scalable and accurate risk stratification to identify people most likely to benefit from anticipatory health care and personalized interventions.</p></div>","PeriodicalId":7979,"journal":{"name":"Annals of hepatology","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1665268124003223/pdfft?md5=78eda29d27beb056fd99526cd9d223f2&pid=1-s2.0-S1665268124003223-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annals of hepatology","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1665268124003223","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"GASTROENTEROLOGY & HEPATOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Introduction and Objectives
Despite the huge clinical burden of MASLD, validated tools for early risk stratification are lacking, and heterogeneous disease expression and a highly variable rate of progression to clinical outcomes result in prognostic uncertainty. We aimed to investigate longitudinal electronic health record-based outcome prediction in MASLD using a state-of-the-art machine learning model.
Patients and Methods
n = 940 patients with histologically-defined MASLD were used to develop a deep-learning model for all-cause mortality prediction. Patient timelines, spanning 12 years, were fully-annotated with demographic/clinical characteristics, ICD-9 and -10 codes, blood test results, prescribing data, and secondary care activity. A Transformer neural network (TNN) was trained to output concomitant probabilities of 12-, 24-, and 36-month all-cause mortality. In-sample performance was assessed using 5-fold cross-validation. Out-of-sample performance was assessed in an independent set of n = 528 MASLD patients.
Results
In-sample model performance achieved AUROC curve 0.74–0.90 (95 % CI: 0.72–0.94), sensitivity 64 %-82 %, specificity 75 %–92 % and Positive Predictive Value (PPV) 94 %-98 %. Out-of-sample model validation had AUROC 0.70–0.86 (95 % CI: 0.67–0.90), sensitivity 69 %–70 %, specificity 96 %–97 % and PPV 75 %–77 %. Key predictive factors, identified using coefficients of determination, were age, presence of type 2 diabetes, and history of hospital admissions with length of stay >14 days.
Conclusions
A TNN, applied to routinely-collected longitudinal electronic health records, achieved good performance in prediction of 12-, 24-, and 36-month all-cause mortality in patients with MASLD. Extrapolation of our technique to population-level data will enable scalable and accurate risk stratification to identify people most likely to benefit from anticipatory health care and personalized interventions.
期刊介绍:
Annals of Hepatology publishes original research on the biology and diseases of the liver in both humans and experimental models. Contributions may be submitted as regular articles. The journal also publishes concise reviews of both basic and clinical topics.