Pub Date: 2025-10-15; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf120
Tony Hauptmann, Sven-Oliver Tröbs, Andreas Schulz, Aida Romano Martinez, Philipp Lurz, Jürgen Prochaska, Philipp Sebastian Wild, Stefan Kramer
Aims: Automatic echocardiographic measurements using artificial intelligence have shown promising results; however, they have not been compared with manual measurements regarding heart failure (HF) progression and algorithm runtime.
Methods and results: Data came from the prospective HF study MyoVasc (NCT04064450), which involved a highly standardized 5-h examination, including comprehensive echocardiography, at a dedicated study centre between January 2013 and April 2018. Worsening of HF was a primary composite endpoint, recorded by structured follow-up, death certificates, and medical records. The automated assessment was performed using Echo-DL, a set of eight 3D convolutional neural networks (CNNs) trained to predict clinical parameters. Manual and automatic left ventricular ejection fraction (LVEF), E/E'-ratio, and left ventricular mass (LVM) demonstrated a good intraclass correlation coefficient {LVEF: 0.75 [95% confidence interval (CI) 0.75-0.77], E/E'-ratio: 0.59 [CI 0.56-0.61], LVM: 0.64 [CI 0.62-0.66]}. After a median follow-up of 3.8 years (IQR 2.1-5.0), 470 patients experienced worsening of HF. In multivariable Cox analysis, comparison of manually and automatically assessed LVEF, E/E'-ratio, and LVM demonstrated risk estimates slightly in favour of the CNNs. Direct comparison of C-indices showed significantly better model performance for automatically determined LVEF (0.71 vs. 0.73, P = 0.038) and E/E'-ratio (0.64 vs. 0.66, P = 0.013) and a trend for LVM (0.66 vs. 0.68, P = 0.063). Echo-DL required an average of 1053.4 ms (95% CI 1050.7-1056.0) to analyse a four-second-long echocardiogram.
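The C-indices compared above summarise how well a measurement ranks patients by time to worsening of HF. As a rough illustration only, the sketch below computes Harrell's C-index in plain Python. The function name `harrell_c_index` and the toy data are hypothetical, and the abstract does not state which estimator or software the authors used.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative).

    A pair is comparable when the subject with the shorter follow-up time
    had an event. It is concordant when that subject also has the higher
    risk score; tied scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): follow-up in years, event flag, model risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.8, 0.1]
c_index = harrell_c_index(times, events, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why differences such as 0.71 vs. 0.73 are small in absolute terms even when statistically significant.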
Conclusion: Automated analysis of echocardiograms using 3D CNNs was comparable to manual measurements in predicting HF-specific outcomes. Echo-DL offers potential time savings and improved risk prediction in clinical settings, allowing integration into echocardiographic hardware.
Title: Echocardiographic measures read by artificial intelligence enable accurate and rapid prediction of the worsening of heart failure.
Journal: European heart journal. Digital health, volume 6, issue 6, pages 1246-1256. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629647/pdf/
Pub Date: 2025-10-13; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf116
Shuang Leng, Nicholas Cheng, Eddy Tan, Lohendran Baskaran, Lynette Teo, Min Sen Yew, Kee Yuan Ngiam, Weimin Huang, Ping Chai, Ching Ching Ong, Ching Hui Sia, Malay Singh, Yan Ting Loong, Nur A S Raffiee, Xiaomeng Wang, John Allen, Swee Yaw Tan, Mark Chan, Hwee Kuan Lee, Liang Zhong
Aims: Epicardial adipose tissue (EAT), located within the pericardial sac, has emerged as a biomarker for coronary artery disease (CAD) progression. This study aimed to develop and validate a deep learning-based system for automated EAT volume quantification using non-contrast computed tomography (NCCT) scans from a large, multi-centre, pan-Asian cohort.
Methods and results: A total of 1243 NCCT patient scans from three centres were used to train and internally validate a deep learning model based on 3D UNet++ architecture for pericardium segmentation, followed by intensity thresholding to derive EAT volume. Epicardial adipose tissue quantification required ∼30 s per scan. The final model was evaluated on an external testing cohort of 160 patients, including 90 non-Asian individuals. In this cohort, AI-predicted EAT volumes showed excellent agreement with expert annotations (r = 0.975; P < 0.0001). The Bland-Altman analysis demonstrated a mean bias of -5.2 cm³ with 95% limits of agreement from -25.1 to 14.7 cm³. Among the non-Asian subgroup, model performance remained strong (r = 0.970; bias, -3.2 cm³; limits of agreement, -25.1 to 18.7 cm³). AI-derived EAT volume was independently associated with obstructive CAD (odds ratio 1.11; 95% confidence interval, 1.04-1.19; P = 0.004), after adjusting for confounders. The global χ² statistic increased from 81.7 with coronary calcium score alone to 93.3 when EAT volume was added (P = 0.001), indicating improved risk prediction.
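The Bland-Altman figures above (mean bias and 95% limits of agreement) can be sketched as follows. This is a minimal illustration, assuming the common convention of bias ± 1.96 standard deviations of the paired differences. The abstract does not state the formula, although the reported limits are symmetric about the bias, which is consistent with that convention. The function name `bland_altman` and the sample volumes are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(predicted, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (predicted - reference); the limits of agreement
    are bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD (n - 1 in the denominator)
    return bias, bias - spread, bias + spread


# Made-up EAT volumes in cm^3: model output vs. expert annotation.
ai_volumes = [100.0, 110.0, 120.0, 130.0]
expert_volumes = [102.0, 108.0, 125.0, 131.0]
bias, lower, upper = bland_altman(ai_volumes, expert_volumes)
```

A negative bias, as in the abstract, would mean the model tends to report slightly smaller volumes than the expert annotation.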
Conclusion: We developed and validated a deep learning system for automated EAT volume quantification from NCCT scans. The model demonstrated high accuracy and generalizability across ethnically diverse populations, supporting its potential for routine EAT assessment and CAD risk stratification.
Trial registration: ClinicalTrials.gov Identifier: NCT05509010.
Title: Deep learning-based quantification of epicardial adipose tissue volume from non-contrast computed tomography images: a multi-centre study.
Journal: European heart journal. Digital health, volume 6, issue 6, pages 1223-1233. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629654/pdf/
Pub Date: 2025-10-06; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf118
Thomas F Kok, Navin Suthahar, Jesse H Krijthe, Rudolf A de Boer, Eric Boersma, Isabella Kardys
Aims: We aimed to compare performances of conventional survival models with machine learning (ML) survival models for incident heart failure (HF) in men and women without prevalent HF, cardiomyopathy (CM) or ischaemic heart disease (IHD), and to identify potential high-risk precursors overlooked by conventional survival models.
Methods and results: We predicted 10-year risk of incident HF in 266 306 women (2894 events) and 212 061 men (4213 events). We constructed multivariable Cox models, first using ∼400 baseline characteristics, and subsequently only those remaining after LASSO stability selection. We also used Random Survival Forest (RSF) and eXtreme Gradient Survival Boosting (XGBoost). Performances were assessed using internal cross-validation and hold-out sets, with C-indices, calibration curves, and net-benefit analyses. Model performances were comparable during internal validation: XGBoost (C-index ± SE) (men: 0.79 ± 0.0040, women: 0.83 ± 0.0023) showed similar performance to the multivariable Cox model (men: 0.80 ± 0.0031, women: 0.83 ± 0.0022) and Cox models after LASSO stability selection, while RSF showed numerically slightly lower performance (men: 0.78 ± 0.0025, women: 0.81 ± 0.0015). Findings were similar in the hold-out sets. Age, cystatin-C, lifetime treatments/medications, other heart disease, systolic blood pressure, and spirometry measures were identified as high-risk factors in both model types for both sexes. Additionally, sex-specific and model-specific risk factors were identified.
Conclusion: Machine learning models and Cox proportional hazard models performed well and similarly for 10-year incident HF risk prediction in the general population. However, sex-specific and model-specific risk predictors were found. Spirometry measures, rarely included in existing models, were identified as important risk factors. Our results suggest that ML models for HF prediction in the general population reveal insights that would otherwise remain unnoticed.
Title: High-dimensional machine learning models for prediction of heart failure in more than 400 000 men and women from the UK Biobank.
Journal: European heart journal. Digital health, volume 6, issue 6, pages 1234-1245. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629653/pdf/
Pub Date: 2025-10-06; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf117
Ivor B Asztalos, Amy Li, Victoria L Vetter, John K Triedman, Joshua Mayourian
Aims: To assess the potential for artificial intelligence-enabled electrocardiogram (AI-ECG) to serve as a long-term cardiac surveillance tool and predict left ventricular systolic dysfunction in childhood cancer patients.
Methods and results: We assessed performance of our previously established AI-ECG model to predict left ventricular ejection fraction (LVEF) ≤50% and ≤40% in patients with childhood cancer during internal testing (Boston Children's Hospital) and external validation (Children's Hospital of Philadelphia). The internal test cohort comprised 447 patients [57% male; age at cancer diagnosis 11.2 (5.4-15.7) years; 1553 ECG-echo pairs at median age 13.5 (IQR 7.7-17.9) years; 6.4% with LVEF ≤50%; 1.3% with LVEF ≤40%]. Cancer diagnoses were leukaemia (28%), lymphoma (16%), neuroblastoma (8%), sarcoma (8%), gastrointestinal cancers (2%), genitourinary cancers (3%), central nervous system cancers (6%), other/unspecified cancers (11%), and missing/unknown cancer labels (18%). Treatment strategies included anthracyclines (35%), bone marrow transplant (7%), and radiation (1%). The external test cohort comprised 2964 patients [55.4% male; 7054 ECG-echo pairs at median age 11.6 (IQR 6.8-15.1) years; 2.5% with LVEF ≤50%; 0.9% with LVEF ≤40%]. Similar AUROCs (0.80-0.85), sensitivities (0.75-0.82), NPVs (0.986-0.996), and percent predicted negative (51-65%) were obtained across institutions to predict LVEF ≤50%, outperforming a biomarker-based model benchmark. Patients with high AI-ECG risk scores for LVEF ≤50% had higher rates of mortality [hazard ratio 3.1 (95% CI 1.8-5.3), P < 0.001] compared to patients with low AI-ECG risk scores.
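Sensitivity, negative predictive value (NPV), and percent predicted negative, as reported above, all follow from a 2×2 confusion table. The sketch below shows the standard definitions under the assumption that "percent predicted negative" means the share of all tests the model labels negative. The function name `screening_metrics` and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute screening metrics from confusion-table counts.

    tp/fn: patients with reduced LVEF flagged / missed by the model.
    tn/fp: patients without reduced LVEF cleared / flagged by the model.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fn) == 0:
        raise ValueError("counts do not allow all metrics to be computed")
    return {
        "sensitivity": tp / (tp + fn),
        "npv": tn / (tn + fn),
        "percent_predicted_negative": 100.0 * (tn + fn) / total,
    }


# Made-up counts for 1000 ECG-echo pairs with a low prevalence of dysfunction.
metrics = screening_metrics(tp=20, fp=380, tn=595, fn=5)
```

With a low prevalence of dysfunction, NPV can be very high even at moderate sensitivity, which is why the percent predicted negative is a useful companion figure: it indicates how many patients a negative result could apply to.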
Conclusion: AI-ECG shows promise as a digital biomarker for cardiac surveillance in the vulnerable childhood cancer survivor population.
Title: Cardiac surveillance of childhood cancer using artificial intelligence-enabled electrocardiograms.
Journal: European heart journal. Digital health, volume 6, issue 6, pages 1293-1296. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629641/pdf/
Pub Date: 2025-10-06; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf106
Manuel Sigle, Diana Heurich, Wenke Faller, Meinrad Gawaz, Karin Anne Lydia Mueller, Andreas Goldschmied
Aims: Overdiagnosis in patients suspected of acute coronary syndrome (ACS) leads to unnecessary coronary angiographies, particularly in cases with non-specifically elevated troponin (Trop) levels. We established machine learning (ML) models integrating sequentially available prehospital and in-hospital variables to improve early prediction of the need for coronary revascularization while minimizing overdiagnosis.
Methods and results: Retrospective cohort study analysing patients with suspected ACS from 2016 to 2020. Machine learning models were trained using data available at different diagnostic time points, including prehospital assessment, arterial blood gas analysis, full laboratory results, and sequential Trop measurements. A total of 2756 patients were included, identified through emergency physician protocols for ACS-related complaints. Patients with incomplete data or prehospital mortality were excluded. Model performance improved with additional diagnostic data. Model 1 (prehospital data only) achieved an area under the receiver operating characteristic curve (AUROC) of 0.76 (95% confidence interval [CI] 0.72-0.79), while Model 4 (including sequential Trop testing) reached 0.87 (95% CI 0.83-0.91). Adding early hospital diagnostics (Model 2) significantly improved accuracy compared with Model 1 (0.65 vs. 0.78). Sequential Trop testing in Model 4 did not substantially enhance performance compared with single Trop testing in Model 3 (AUROC 0.87, 95% CI 0.83-0.91 vs. 0.86, 95% CI 0.82-0.91). Misclassification analysis revealed that underdiagnosed patients were typically older females with dyspnoea and known coronary artery disease but no ST-elevations. Overdiagnosed patients had higher body mass index, ST-elevations, regional wall motion abnormalities, and impaired left ventricular ejection fraction but lacked significant sequential Trop elevation.
Conclusion: Prehospital assessments combined with early in-hospital diagnostics provide reliable stratification of coronary intervention need, potentially optimizing clinical decision-making and resource utilization.
Title: AI-guided refinement of coronary revascularization need in patients suspected of acute coronary syndrome.
Journal: European heart journal. Digital health, volume 6, issue 6, pages 1169-1180. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629660/pdf/
Pub Date: 2025-10-03; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf109
Daniel Pavluk, Fabian Theurl, Samuel Proell, Michael Schreinlecher, Florian Hofer, Patrick Rockenschaub, Angus Nicolson, Mercedes Gauthier, Sebastian Reinstadler, Clemens Dlaska, Axel Bauer
Aims: Artificial Intelligence (AI) models applied to standard 12-lead ECGs enable estimation of biological age (AI-ECG age), which has shown prognostic value in general populations. However, its clinical utility in high-risk patients with cardiovascular disease (CVD) or acute medical conditions remains insufficiently explored.
Methods and results: We analysed the first ECG of 48 950 consecutive patients presenting to a tertiary care centre with CVD or acute illness between 2000 and 2021. AI-ECG age was derived using a validated deep learning model. Δ-age, defined as the difference between AI-ECG and chronological age, was analysed categorically (±8 years) and continuously using multivariable Cox models adjusted for clinical and ECG variables. The primary endpoint was long-term total mortality (up to 10 years). Saliency map analysis was performed to identify the input regions to which the model was most sensitive. AI-ECG age correlated strongly with chronological age (r = 0.72, P < 0.001), though this correlation weakened in patients with multiple comorbidities. Patients with a positive Δ-age (≥+8 years) had significantly higher 10-year mortality risk (HR: 1.45, P < 0.001), while those with a negative Δ-age (≤-8 years) had lower risk (HR: 0.88, P < 0.001). These associations were consistent across care settings and remained robust when Δ-age was analysed continuously. Saliency maps indicated that the AI model was most sensitive to the P-wave.
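The Δ-age definition and the ±8-year categories described above can be written down directly. The following is a minimal sketch: the function names and the label "neutral" for the middle group are assumptions made for illustration, while the threshold of 8 years and the inclusive cut-offs (≥+8, ≤-8) come from the abstract.

```python
def delta_age(ai_ecg_age, chronological_age):
    """Delta-age: AI-ECG estimated age minus chronological age, in years."""
    return ai_ecg_age - chronological_age


def delta_age_category(delta, threshold=8.0):
    """Assign a delta-age value to one of three groups.

    'positive' if delta >= +threshold, 'negative' if delta <= -threshold,
    otherwise 'neutral' (the middle label is a hypothetical name).
    """
    if delta >= threshold:
        return "positive"
    if delta <= -threshold:
        return "negative"
    return "neutral"


# Made-up example: a 60-year-old whose ECG is read as 70 years old.
example = delta_age_category(delta_age(70.0, 60.0))  # "positive"
```

In the study, the positive group carried the higher mortality risk and the negative group the lower one, relative to patients whose AI-ECG age was close to their chronological age.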
Conclusion: AI-ECG age is a strong and independent predictor of long-term mortality in cardiovascular and acute care patients.
Title: AI-ECG-derived biological age as a predictor of mortality in cardiovascular and acute care patients. Journal: European heart journal. Digital health, vol. 6, no. 6, pp. 1204-1215. Published: 2025-10-03. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629650/pdf/
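The categorical Δ-age analysis described in the abstract above can be illustrated with a minimal sketch. It assumes Δ-age is computed as AI-ECG age minus chronological age (the sign convention is inferred from the "positive Δ-age" wording, not stated explicitly), and the function names and the "neutral" label for the middle group are hypothetical.

```python
def delta_age(ai_ecg_age: float, chronological_age: float) -> float:
    """Delta-age in years. Assumed sign convention: AI-ECG age minus chronological age."""
    return ai_ecg_age - chronological_age


def delta_age_category(delta: float, threshold: float = 8.0) -> str:
    """Assign one of three groups using the +/-8-year cut-offs from the abstract.

    The label "neutral" for the middle group is a placeholder chosen for this sketch.
    """
    if delta >= threshold:
        return "positive"   # ECG looks at least 8 years older than the patient
    if delta <= -threshold:
        return "negative"   # ECG looks at least 8 years younger than the patient
    return "neutral"


if __name__ == "__main__":
    # Made-up example ages, for illustration only.
    for ai_age, true_age in [(70.0, 60.0), (52.0, 60.0), (63.0, 60.0)]:
        d = delta_age(ai_age, true_age)
        print(ai_age, true_age, d, delta_age_category(d))
```

The resulting category (or the continuous Δ-age itself) would then enter a survival model as a covariate; the Cox modelling step is not sketched here.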
Pub Date: 2025-09-26; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf089
Julia Lortz, Tienush Rassaf, Laura Johannsen, Wibke Tonscheidt, Finley Sam Mellis, Lisa Maria Jahre, Marc Hesenius, Marvin Bachert, Christos Rammos, Martin Teufel, Alexander Bäuerle
Aims: Cardiovascular disease is the leading global cause of mortality. Traditional face-to-face cardiovascular care, while effective, poses challenges such as travel burdens and accessibility issues. Video consultations offer a modern solution, improving access and efficiency while reducing patient strain. This study investigates patient acceptance of video consultations in cardiovascular care using a survey-based approach, assessing key factors influencing their integration into routine practice.
Methods and results: A cross-sectional study including patients attending a cardiological university hospital was conducted. Acceptance of video consultations and its associated factors were assessed using a modified assessment instrument based on the unified theory of acceptance and use of technology. The study comprised 337 participants (M = 61.13 years, SD = 14.54), 54.6% male. Acceptance was moderate (M = 2.88, SD = 1.37), with 30.27% of the participants reporting high acceptance, 28.19% reporting moderate acceptance, and 41.54% low acceptance. Only 3% had used video consultations before. eHealth literacy was high, while digital confidence was moderate. Analysis showed that higher education, digital confidence, and eHealth literacy predicted greater acceptance of video consultations. Effort expectancy, performance expectancy (PE), and social influence (SI) accounted for most of the variance in acceptance (R² = 0.724).
Conclusion: We identified moderate acceptance of video consultations in cardiology, with education, digital confidence, eHealth literacy, and PE as key factors associated with acceptance. Despite low prior use, perceived ease of use and SI were most strongly associated with acceptance. Addressing technical concerns and promoting digital literacy may enhance adoption, improving access to remote cardiac care.
Title: Patient acceptance of video consultations in cardiology. Journal: European heart journal. Digital health, vol. 6, no. 6, pp. 1273-1281. Published: 2025-09-26. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629656/pdf/
Pub Date: 2025-09-23; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf112
I Min Chiu, Yuki Sahashi, Milos Vukadinovic, Paul P Cheng, Susan Cheng, David Ouyang
Aims: Timely and accurate detection of pericardial effusion and assessment of cardiac tamponade remain challenging and highly operator dependent.
Objectives: Artificial intelligence has advanced many echocardiographic assessments, and we aimed to develop and validate a deep learning model to automate the assessment of pericardial effusion severity and cardiac tamponade from echocardiogram videos.
Methods and results: We developed a deep learning model (EchoNet-Pericardium) using temporal-spatial convolutional neural networks to automate pericardial effusion severity grading and tamponade detection from echocardiography videos. The model was trained using a retrospective dataset of 1 427 660 videos from 85 380 echocardiograms at Cedars-Sinai Medical Centre (CSMC) to predict pericardial effusion severity and cardiac tamponade across individual echocardiographic views and an ensemble approach combining predictions from five standard views. External validation was performed on 33 310 videos from 1806 echocardiograms from Stanford Healthcare (SHC). In the held-out CSMC test set, EchoNet-Pericardium achieved an AUC of 0.900 (95% CI: 0.884-0.916) for detecting moderate or larger pericardial effusion, 0.942 (95% CI: 0.917-0.964) for large pericardial effusion, and 0.955 (95% CI: 0.939-0.968) for cardiac tamponade. In the SHC external validation cohort, the model achieved AUCs of 0.869 (95% CI: 0.794-0.933) for moderate or larger pericardial effusion, 0.959 (95% CI: 0.945-0.972) for large pericardial effusion, and 0.966 (95% CI: 0.906-0.995) for cardiac tamponade. Subgroup analysis demonstrated consistent performance across ages, sexes, left ventricular ejection fraction, and atrial fibrillation statuses.
Conclusion: Our deep learning-based framework accurately grades pericardial effusion severity and detects cardiac tamponade from echocardiograms, demonstrating consistent performance and generalizability across different cohorts. This automated tool has the potential to enhance clinical decision-making by reducing operator dependence and expediting diagnosis.
Title: Automated evaluation for pericardial effusion and cardiac tamponade with echocardiographic artificial intelligence. Journal: European heart journal. Digital health, vol. 6, no. 6, pp. 1216-1222. Published: 2025-09-23. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629644/pdf/
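The abstract above mentions an ensemble that combines predictions from five standard views but does not say how they are combined. As a purely illustrative sketch, the snippet below assumes an unweighted mean of per-view probabilities; the view names, the probabilities, and the function name are hypothetical and are not taken from EchoNet-Pericardium.

```python
from statistics import fmean


def ensemble_probability(view_probs: dict[str, float]) -> float:
    """Combine per-view probabilities into one exam-level probability.

    Assumption for this sketch: a simple unweighted mean. The published model
    may use a different combination rule.
    """
    if not view_probs:
        raise ValueError("at least one view-level prediction is required")
    return fmean(view_probs.values())


if __name__ == "__main__":
    # Hypothetical view names and made-up probabilities for one exam.
    example = {
        "view_1": 0.9,
        "view_2": 0.8,
        "view_3": 0.7,
        "view_4": 0.6,
        "view_5": 0.5,
    }
    print(ensemble_probability(example))
```

Averaging is only one option; a weighted mean or a small learned combiner would fit the same interface.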
Pub Date: 2025-09-23; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf111
Robert M Radke, Gerhard-Paul Diller, Rohan G Reddy, Pushpa Shivaram, David A Danford, Shelby Kutty
Aims: The aim of the current study was to assess the utility of a state-of-the-art large language model (LLM) based on curated, defined clinical practice recommendations to support clinicians in obtaining point-of-care guidelines for individual patient treatment while maintaining transparency.
Methods and results: We combined cloud-based and locally run LLMs with versatile open-source tools to form a multi-query, multimodal, retrieval-augmented generation chain that closely reflects European cardiology guidelines in its responses. We compared the performance of this generation chain to other LLMs including GPT-3.5 and GPT-4 on a 306-question multiple-choice exam with questions consisting of short patient vignettes from various cardiology subspecialties, originally written to prepare candidates for the European Exam in Core Cardiology. On the multiple-choice test, our system demonstrated overall accuracy of 73.53%, while GPT-3.5 and GPT-4 had overall accuracies of 44.03 and 62.26%, respectively. Our system outperformed GPT-3.5 and GPT-4 for the following categories of questions: coronary artery disease, arrhythmia, other, valvular heart disease, cardiomyopathies, endocarditis, adult congenital heart disease, pericardial disease, cardio-oncology, pulmonary hypertension, and non-cardiac surgery. For maximum transparency, the system also provided reference quotes for its recommendations.
Conclusion: Our system demonstrated superior performance in question-answering tasks on a set of core cardiology questions as compared with contemporary publicly available chat models. The current study illustrates that LLMs can be tailored to provide documented and accountable guideline recommendations towards actual clinical needs while ensuring recommendations are derived from up-to-date, trustable, and traceable documents.
Title: A multi-query, multimodal, receiver-augmented solution to extract contemporary cardiology guideline information using large language models. Journal: European heart journal. Digital health, vol. 6, no. 6, pp. 1257-1263. Published: 2025-09-23. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629642/pdf/
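The multi-query retrieval idea in the abstract above can be sketched in a few lines. This is a toy illustration, not the authors' system: it assumes keyword-overlap scoring in place of a real embedding search, omits the multimodal part and the LLM call entirely, and uses placeholder passages rather than actual guideline text. All function names and passage ids are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lower-case word tokens; a stand-in for a real embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, passages: dict[str, str], k: int = 2) -> list[str]:
    """Return up to k passage ids ranked by token overlap with the query."""
    q = tokenize(query)
    ranked = sorted(passages, key=lambda pid: (-len(q & tokenize(passages[pid])), pid))
    return [pid for pid in ranked[:k] if q & tokenize(passages[pid])]


def multi_query_retrieve(queries: list[str], passages: dict[str, str], k: int = 2) -> list[str]:
    """Run several reformulations of one question and merge the hits, keeping order."""
    merged: list[str] = []
    for query in queries:
        for pid in retrieve(query, passages, k):
            if pid not in merged:
                merged.append(pid)
    return merged


def build_prompt(question: str, passage_ids: list[str], passages: dict[str, str]) -> str:
    """Assemble a prompt that quotes each retrieved passage with its id for traceability."""
    quotes = "\n".join(f'[{pid}] "{passages[pid]}"' for pid in passage_ids)
    return (
        "Answer using only the quoted passages and cite their ids.\n"
        f"{quotes}\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Placeholder passages, NOT real guideline text.
    corpus = {
        "g1": "placeholder text on anticoagulation in atrial fibrillation",
        "g2": "placeholder text on valve replacement in aortic stenosis",
        "g3": "placeholder text on endocarditis prophylaxis",
    }
    variants = ["anticoagulation atrial fibrillation", "aortic stenosis valve"]
    ids = multi_query_retrieve(variants, corpus, k=1)
    print(build_prompt("Hypothetical vignette question?", ids, corpus))
```

Quoting each passage next to its id is what makes the final answer traceable back to a source document, which is the transparency property the abstract emphasises.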
Pub Date: 2025-09-17; eCollection Date: 2025-11-01; DOI: 10.1093/ehjdh/ztaf108
Jieyu Hu, Sindre Hellum Olaisen, David Pasdeloup, Gilles Van De Vyver, Andreas Østvik, Espen Holte, Bjørnar Grenne, Håvard Dalen, Lasse Lovstakken
Background: Echocardiographic image data accumulating in echo labs are a highly valuable but underutilized resource for cardiac imaging research. Despite the availability of large image databases, quantitative measurements required for clinical analysis and research remain limited. Retrospective manual measurements are highly time-consuming and susceptible to operator-related variability. Moreover, data curation and quality control metrics are needed to prepare real-world data for analysis.
Methods: Deep learning-based image analysis can provide fully automated, rapid, and consistent extraction of measurements, given that the data have been properly curated. In this work, we develop an automated pipeline for data curation of a large echo database of 14 326 exams from 9678 patients and evaluate automated measurements of left ventricular ejection fraction (LVEF) and left atrial volume index (LAVI) as a use case.
Results: In a validation subsample of 1763 subjects with varying image quality and cardiac diseases and 1488 healthy subjects, the pipeline output was compared with manual measurements. Bland-Altman analysis revealed a bias [standard deviation (SD)] of -1.8% (7.6%) for LVEF and 3.3 mL/m² (8.1 mL/m²) for LAVI and demonstrated robust performance for varying image quality and pathological conditions. Additionally, in the large part of the database (9678 exams) without clinical measurements, automated data curation and measurement quality control yielded high-confidence measurements for 79% of the data.
Conclusion: This work highlights the potential of deep learning-based automated measurements in echocardiography for data mining in large real-world databases, paving the way for advancements in cardiac imaging research and diagnostics.
Title: A deep learning-based pipeline for large-scale echocardiography data curation and measurements. Journal: European heart journal. Digital health, vol. 6, no. 6, pp. 1194-1203. Published: 2025-09-17. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629651/pdf/
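The Bland-Altman bias and SD reported in the abstract above follow from a simple calculation on paired measurements. The sketch below assumes differences are taken as automated minus manual (the abstract does not state the direction) and uses made-up LVEF values; the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(automated: list[float], manual: list[float]) -> tuple[float, float, tuple[float, float]]:
    """Return (bias, SD of differences, 95% limits of agreement).

    Assumed convention for this sketch: difference = automated - manual.
    """
    if len(automated) != len(manual) or len(automated) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [a - m for a, m in zip(automated, manual)]
    bias = mean(diffs)            # systematic offset between the two methods
    sd = stdev(diffs)             # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


if __name__ == "__main__":
    # Made-up LVEF values (%), for illustration only.
    auto_lvef = [55.0, 60.0, 48.0, 62.0]
    manual_lvef = [57.0, 61.0, 51.0, 63.0]
    print(bland_altman(auto_lvef, manual_lvef))
```

A negative bias under this convention would mean the automated method reads lower than the manual reference on average.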