Pub Date: 2025-09-01; Epub Date: 2025-09-09; DOI: 10.1016/j.landig.2025.100896
Jonathan P Bedford DPhil, Oliver Redfern PhD, Stephen Gerry DPhil, Robert Hatch BA, Prof Liza Keating MSc, Prof David Clifton DPhil, Prof Gary S Collins PhD, Prof Peter J Watkinson MD
Background
New-onset atrial fibrillation, a condition associated with adverse outcomes in the short and long term, is common in patients admitted to intensive care units (ICUs). Identifying patients at high risk could inform trials of preventive interventions and help to target such interventions. We aimed to develop and externally validate a prediction model for new-onset atrial fibrillation in patients admitted to ICUs.
Methods
We conducted a multicentre, retrospective cohort study in three ICUs across the UK and four ICUs across the USA. Patients aged 16 years and older admitted to an ICU for more than 3 h without a history or presentation of clinically significant arrhythmia were eligible for inclusion. We analysed clinical variables to investigate the associations between predetermined candidate variables and risk of new-onset atrial fibrillation and to develop a model to estimate these risks. We developed the METRIC-AF model, a machine learning model incorporating dynamic variables. Model performance was assessed through internal–external cross-validation during model development and externally validated by use of multicentre data from ICUs across the UK. We then developed a simple graphical prediction tool using three important predictors.
Findings
Among 39 084 eligible patients admitted to an ICU between 2008 and 2019, 2797 (7·2%) developed new-onset atrial fibrillation during the first 7 days of their ICU stay. We identified multiple non-linear associations between candidate variables and risk of new-onset atrial fibrillation, including hypomagnesaemia at serum concentrations below 0·70 mmol/L. The final METRIC-AF model contained ten routinely collected clinical variables. Compared with a published logistic regression model, the METRIC-AF model showed superior calibration, net benefit across clinically relevant risk thresholds, and discriminative performance (C statistic 0·812 [95% CI 0·805–0·822] vs 0·786 [0·778–0·801]; p=0·0003). The simple graphical tool performed well in attributing the risk of new-onset atrial fibrillation in the external validation dataset (C statistic 0·727 [95% CI 0·716–0·739]).
Interpretation
The METRIC-AF model and its companion graphical tool could support the identification of patients at increased risk of developing new-onset atrial fibrillation during ICU admission, informing targeted prophylactic strategies and trial enrichment by use of routinely available clinical data. An online app also developed as part of the study allows for the exploration of prediction generation among individuals and external validation in prospective studies.
Funding
National Institute for Health and Care Research (NIHR) and NIHR Oxford Biomedical Research Centre.
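The C statistic quoted in the Findings is the probability that a randomly chosen patient who developed the outcome was assigned a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The following is a minimal sketch of that definition, not the METRIC-AF authors' code; the function name `c_statistic` and the toy risks and outcomes are hypothetical and chosen only for illustration. The last line checks the reported incidence (2797 of 39 084 patients) by direct arithmetic.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance (C) statistic for a binary outcome.

    risks:    predicted probabilities, one per patient
    outcomes: 1 if the event occurred, 0 otherwise
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one patient with and one without the event")

    concordant = 0.0
    # Compare every (event, non-event) pair of patients.
    for case, control in product(cases, controls):
        if case > control:
            concordant += 1.0   # pair ranked correctly
        elif case == control:
            concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data, NOT taken from the study.
toy_risks = [0.02, 0.05, 0.10, 0.30, 0.40, 0.08]
toy_outcomes = [0, 0, 1, 1, 0, 0]
toy_c = c_statistic(toy_risks, toy_outcomes)

# Incidence reported in the Findings: 2797 of 39 084 eligible patients.
incidence_pct = round(100 * 2797 / 39084, 1)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; for cohorts of the size reported here, a rank-based implementation would be the more practical choice.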
Development and external validation of a clinical prediction model for new-onset atrial fibrillation in intensive care: a multicentre, retrospective cohort study. Lancet Digital Health, volume 7, issue 9, Article 100896 (published 2025-09-01).
Background
The effect of digital adherence technologies (DATs) on long-term tuberculosis treatment outcomes remains unclear. We aimed to assess the effectiveness of DATs in improving tuberculosis treatment outcomes and recurrence.
Methods
We did a pragmatic cluster-randomised trial in Ethiopia. 78 health facilities (clusters) were randomised (1:1:1) to smart pillbox, medication labels, or standard of care. Adults aged 18 years or older with drug-sensitive pulmonary tuberculosis on a fixed-dose combination tuberculosis treatment regimen were enrolled and followed up for 12 months after treatment initiation. Participants in the smart pillbox group received a pillbox with customisable audio-visual reminders, whereas participants in the medication labels group received their tuberculosis medication with a unique weekly code label. Opening the box or texting the code prompted real-time dose logging on the adherence platform, which allowed a health-care worker to give a differentiated response to an individual’s adherence. The primary composite outcome comprised death, loss to follow-up, treatment failure, switch to drug-resistant tuberculosis treatment, or recurrence. Secondary outcomes were poor end-of-treatment outcome and loss to follow-up. Analysis accounted for the clustered design, with multiple imputation for the primary composite outcome. The trial is registered with the Pan African Clinical Trials Registry (PACTR202008776694999) and is complete.
Findings
From May 24, 2021, to Aug 8, 2022, 8477 individuals undergoing tuberculosis treatment were assessed for eligibility. Of the 3885 participants enrolled, 3858 were included in the intention-to-treat population. 1567 (40·6%) of 3858 participants were women and the median age of all participants was 30 years (IQR 24–40). At 12 months, using multiple imputation, neither the smart pillbox group (adjusted odds ratio [OR] 1·04 [95% CI 0·74 to 1·45]; adjusted risk difference: 0·96 percentage points [95% CI –1·19 to 3·11]) nor the medication labels group (adjusted OR 1·14 [0·83 to 1·61]; adjusted risk difference: 0·42 percentage points [–1·75 to 2·59]) reduced the risk of the primary composite outcome. There was no evidence of effect on poor end-of-treatment outcomes or loss to follow-up in either intervention group, although the label intervention showed weak evidence of reduced loss to follow-up. Results were similar in complete case and per-protocol analyses.
Interpretation
The DAT interventions showed no reduction in unfavourable outcomes. This emphasises the necessity to optimise DATs to enhance tuberculosis management strategies and treatment outcomes.
Funding
Unitaid.
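The Findings report each effect both as an adjusted odds ratio and as an adjusted risk difference in percentage points. The two scales are linked through the risk in the comparison group, which the abstract does not state. The sketch below shows the textbook conversion under a purely hypothetical baseline risk (`ASSUMED_BASELINE_RISK`); it will not reproduce the trial's adjusted estimates, which come from models that account for clustering and multiple imputation. The function names are illustrative, and the last line only checks the reported proportion of women (1567 of 3858).

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the intervention group implied by a baseline risk and an odds ratio."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    intervention_odds = odds_ratio * baseline_odds
    return intervention_odds / (1.0 + intervention_odds)


def risk_difference_pp(baseline_risk, odds_ratio):
    """Implied risk difference in percentage points (intervention minus baseline)."""
    return 100.0 * (risk_from_odds_ratio(baseline_risk, odds_ratio) - baseline_risk)


# Hypothetical baseline risk; the abstract does not report the control-group risk.
ASSUMED_BASELINE_RISK = 0.10

# Odds ratio of 1.04 as reported for the smart pillbox group.
implied_rd = risk_difference_pp(ASSUMED_BASELINE_RISK, 1.04)

# Proportion of women reported in the Findings: 1567 of 3858 participants.
women_pct = round(100 * 1567 / 3858, 1)
```

Under these assumptions an odds ratio close to 1 corresponds to a risk difference of well under one percentage point, which is why confidence intervals that include 1 on the odds ratio scale also include 0 on the risk difference scale.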
Digital adherence technology interventions to reduce poor end-of-treatment outcomes and recurrence among adults with drug-sensitive tuberculosis in Ethiopia: a three-arm, pragmatic, cluster-randomised, controlled trial. Amare W Tadesse PhD, Mamush Sahile MPH, Nicola Foster PhD, Christopher Finn McQuaid PhD, Gedion Teferra Weldemichael MD, Tofik Abdurhman MSc, Zemedu Mohammed MPH, Mahilet Belachew MSc, Amanuel Shiferaw MPH, Demelash Assefa MPH, Demekech Gadissa MPH, Hiwot Yazew MPH, Nuria Yakob MPH, Zewdneh Shewamene PhD, Lara Goscé PhD, Job van Rest MSc, Norma Madden MSc, Prof Salome Charalambous PhD, Kristian van Kalmthout MSc, Ahmed Bedru MD, Prof Katherine L Fielding PhD. Lancet Digital Health, volume 7, issue 9, Article 100895 (published 2025-09-01). DOI: 10.1016/j.landig.2025.100895
Pub Date: 2025-09-01; Epub Date: 2025-10-09; DOI: 10.1016/j.landig.2025.100925
The Lancet Digital Health
Navigating the landscape of medical artificial intelligence reporting guidelines. Lancet Digital Health, volume 7, issue 9, Article 100925 (published 2025-09-01).
Pub Date: 2025-09-01; Epub Date: 2025-08-14; DOI: 10.1016/j.landig.2025.100890
Bardia Khosravi MD MPH, Saptarshi Purkayastha PhD, Prof Bradley J Erickson MD PhD, Hari M Trivedi MD, Judy W Gichoya MD MS
Generative artificial intelligence has emerged as a transformative force in medical imaging since 2022, enabling the creation of derivative synthetic datasets that closely resemble real-world data. This Viewpoint examines key aspects of synthetic data, focusing on its advancements, applications, and challenges in medical imaging. Various generative artificial intelligence image generation paradigms, such as physics-informed and statistical models, and their potential to augment and diversify medical research resources are explored. The promises of synthetic datasets, including increased diversity, privacy preservation, and multifunctionality, are also discussed, along with their ability to model complex biological phenomena. Next, specific applications using synthetic data such as enhancing medical education, augmenting rare disease datasets, improving radiology workflows, and enabling privacy-preserving multicentre collaborations are highlighted. The challenges and ethical considerations surrounding generative artificial intelligence, including patient privacy, data copying, and potential biases that could impede clinical translation, are also addressed. Finally, future directions for research and development in this rapidly evolving field are outlined, emphasising the need for robust evaluation frameworks and responsible utilisation of generative artificial intelligence in medical imaging.
Exploring the potential of generative artificial intelligence in medical image synthesis: opportunities, challenges, and future directions. Lancet Digital Health, volume 7, issue 9, Article 100890 (published 2025-09-01).
Pub Date: 2025-09-01; Epub Date: 2025-09-25; DOI: 10.1016/j.landig.2025.100901
Qin Zhong PhD, Yuxiao Cheng BEng, Zongren Li PhD, Prof Dongjin Wang MD, Chongyou Rao PhD, Yi Jiang MD, Lianglong Li BEng, Ziqian Wang BEng, Pan Liu PhD, Hebin Che MSc, Pei Li PhD, Prof Xin Lu PhD, Jinli Suo PhD, Kunlun He PhD
Background
Cardiac surgery-associated acute kidney injury (CSA-AKI) is a complex complication that substantially increases the risk of mortality. Effective CSA-AKI management relies on timely diagnosis and intervention, but many cases are detected too late. Despite advances in novel biomarkers and data-driven predictive models, existing practices are constrained by limited discriminative and generalisation capabilities and by stringent application requirements, which hinder timely and effective diagnosis and intervention. This study aimed to develop a causal deep learning architecture, named REACT, to achieve precise and dynamic predictions of CSA-AKI within the subsequent 48 h.
Methods
In this retrospective model development and prospective validation study, we included adult patients (aged ≥18 years) from seven distinct cohorts undergoing major open-heart surgery for model training and validation. Data for model development and internal validation were sourced from electronic health records of two large centres in Beijing, China, between Jan 1, 2000, and Dec 31, 2022. External validation was conducted on three independent centres in China between Jan 1, 2000, and Dec 31, 2022, along with cross-national data from the public databases MIMIC-IV and eICU in the USA. To facilitate implementation, we also developed a publicly accessible web calculator and applet. The model’s prospective application was validated from June 1 to Oct 31, 2023, at two centres in Beijing and Nanjing, China.
Findings
The final derivation cohort included 14 513 eligible patients with a median age of 56 years (IQR 45–65); 5515 (38·0%) patients were female and 3047 (21·0%) developed CSA-AKI. The external validation dataset included 20 813 patients from China and 28 023 from the USA. REACT reduced 1328 input variables to six essential causal factors for CSA-AKI prediction. In internal validation, REACT achieved an average area under the receiver operating characteristic curve (AUROC) of 0·930 (SD 0·032), outperforming state-of-the-art deep learning architectures, specifically transformer-based and long short-term memory-based models, which rely on more complex variables. The model performed consistently well in external validation across different centres (average AUROC 0·920 [SD 0·036]) and regions (0·867 [0·073]), as well as in prospective validation (0·896 [0·023]). Compared with guideline-recommended pathways, REACT detected CSA-AKI on average 16·35 h (SD 2·01) earlier in external validation.
Interpretation
We proposed a causal deep learning approach to predict CSA-AKI risk within 48 h, distilling the complex temporal interactions between variables into only a few universal, relatively cost-effective inputs. The approach shows great potential for deployment across hospitals with minimal data requirements and offers a general framework for causal deep learning and the early detection of other diseases.
Funding
A construction project and the National Natural Science Foundation of China.
Causal deep learning for real-time detection of cardiac surgery-associated acute kidney injury: derivation and validation in seven time-series cohorts. Lancet Digital Health, volume 7, issue 9, Article 100901 (published 2025-09-01).
Pub Date: 2025-09-01; Epub Date: 2025-08-08; DOI: 10.1016/j.landig.2025.100876
Sebastian Voigtlaender MSc, Thomas A Nelson MD, Philipp Karschnia MD, Eugene J Vaios MD, Prof Michelle M Kim MD, Philipp Lohmann PhD, Prof Norbert Galldiks MD, Prof Mariella G Filbin MD, Shekoofeh Azizi PhD, Vivek Natarajan MSc, Prof Michelle Monje MD, Prof Jorg Dietrich MD, Sebastian F Winter MD
CNS cancers are complex, difficult-to-treat malignancies that remain insufficiently understood and mostly incurable, despite decades of research efforts. Artificial intelligence (AI) is poised to reshape neuro-oncological practice and research, driving advances in medical image analysis, neuro–molecular–genetic characterisation, biomarker discovery, therapeutic target identification, tailored management strategies, and neurorehabilitation. This Review examines key opportunities and challenges associated with AI applications along the neuro-oncological care trajectory. We highlight emerging trends in foundation models, biophysical modelling, synthetic data, and drug development and discuss regulatory, operational, and ethical hurdles across data, translation, and implementation gaps. Near-term clinical translation depends on scaling validated AI solutions for well defined clinical tasks. In contrast, more experimental AI solutions offer broader potential but require technical refinement and resolution of data and regulatory challenges. Addressing both general and neuro-oncology-specific issues is essential to unlock the full potential of AI and ensure its responsible, effective, and needs-based integration into neuro-oncological practice.
Value of artificial intelligence in neuro-oncology. Lancet Digital Health, volume 7, issue 9, Article 100876 (published 2025-09-01).
Pub Date: 2025-09-01 Epub Date: 2025-08-26 DOI: 10.1016/j.landig.2025.100910
Arun James Thirunavukarasu, Ernest Lim, Bright Huo
Title: How CHART (Chatbot Assessment Reporting Tool) can help to advance clinical artificial intelligence research through clearer task definition and robust validation. Lancet Digital Health, volume 7, issue 9, Article 100910.
Pub Date: 2025-08-01 Epub Date: 2025-08-26 DOI: 10.1016/j.landig.2025.100888
Juliana C Taube AB, Zachary Susswein BS, Vittoria Colizza PhD, Prof Shweta Bansal PhD
Background
Interpersonal contact has a crucial role in the transmission of infectious diseases. Characterising heterogeneity in contact patterns across individuals, time, and space is necessary to inform accurate estimates of transmission risk, particularly to explain superspreading, predict differences in vulnerability by age, and inform physical distancing policies. Current respiratory disease models often rely on data from the 2008 POLYMOD study conducted in Europe, which is now outdated and is potentially unrepresentative of behaviour in other geographical regions. We aimed to understand the variation in contact patterns in the USA across time, spatial scales, and demographic and social classifications during the COVID-19 pandemic, and to estimate what social behaviour looks like at baseline, in the absence of an ongoing pandemic.
Methods
For this study of contact patterns relevant to respiratory transmission during a pandemic, we examined 10·7 million responses to the US COVID-19 Trends and Impact Survey between June 1, 2020, and April 30, 2021 (ie, during the COVID-19 pandemic); the survey recruited participants aged 18 years and older in the USA through Facebook. Data were post-stratified by age and gender to correct for sample representation. We used generalised additive models to characterise spatiotemporal heterogeneity in respiratory contact patterns during the pandemic at the county-week scale; we established how contact patterns vary by urbanicity, age (18–54 years, 55–64 years, 65–74 years, or ≥75 years), gender (male or female), race or ethnicity (Asian, Black or African American, Hispanic, White, or other), and contact setting (work, shopping for essentials, social gatherings, or other). We used a regression approach to estimate baseline (non-pandemic) contact patterns.
Findings
Although contact patterns varied over time during the COVID-19 pandemic, the average number of daily contacts was relatively stable after controlling for the effect of incidence-mediated risk perception and disease-related policy. The mean number of non-household contacts was spatially heterogeneous, varying across urban versus rural settings, regardless of the presence of disease. Additional heterogeneity was observed across age, gender, race or ethnicity, and contact setting. Mean number of contacts decreased with age for individuals older than 55 years and was lower in women than in men. During periods of increased national incidence of disease, the contacts of White individuals and contacts at work or social gatherings showed the greatest change.
Interpretation
Our findings indicate that US adult baseline contact patterns show little variability over time after controlling for disease, but high spatial variability regardless of disease, with implications for understanding the seasonality of respiratory infectious diseases. The highly structured spat
Title: Characterising non-household contact patterns relevant to respiratory transmission in the USA: analysis of a cross-sectional survey. Lancet Digital Health, volume 7, issue 8, Article 100888.
Pub Date: 2025-08-01 Epub Date: 2025-07-03 DOI: 10.1016/j.landig.2025.100878
Cristina Crocamo PhD, Dario Palpella MD, Daniele Cavaleri MD, Christian Nasti MD, Susanna Piacenti MD, Pietro Morello MD, Giada Lauria MD, Oliviero Villa MD, Ilaria Riboldi PhD, Francesco Bartoli PhD, John Torous MD, Prof Giuseppe Carrà PhD
Digital health interventions (DHIs) show promise for the treatment of mental health disorders. However, existing meta-analytical research is methodologically heterogeneous, with studies including a mix of clinical, non-clinical, and transdiagnostic populations, hindering a comprehensive understanding of DHI effectiveness. Thus, we conducted an umbrella review of meta-analyses of randomised controlled trials investigating the effectiveness of DHIs for specific mental health disorders and evaluating the quality of evidence. We searched three public electronic databases from inception to February, 2024 and included 16 studies. DHIs were effective compared with active interventions for schizophrenia spectrum disorders, major depressive disorder, social anxiety disorder, and panic disorder. Notable treatment effects compared with a waiting list were also observed for specific phobias, generalised anxiety disorder, obsessive-compulsive disorder, post-traumatic stress disorder, and bulimia nervosa. Certainty of evidence was rated as very low or low in most cases, except for generalised anxiety disorder-related outcomes, which showed a moderate rating. To integrate DHIs into clinical practice, further high-quality studies with clearly defined target populations and robust comparators are needed.
Title: Digital health interventions for mental health disorders: an umbrella review of meta-analyses of randomised controlled trials. Lancet Digital Health, volume 7, issue 8, Article 100878.
Pub Date: 2025-08-01 Epub Date: 2025-08-25 DOI: 10.1016/j.landig.2025.100887
Miles Crosskey PhD, Tomas McIntee PhD, Sandy Preiss MS, Daniel Brannock MS, John M Baratta MD, Yun Jae Yoo BS, Emily Hadley MS, Frank Blanceró BA, Robert Chew MS, Johanna Loomba MS, Abhishek Bhatia MS, Prof Christopher G Chute MD, Prof Melissa Haendel PhD, Richard Moffitt PhD, Emily R Pfaff PhD, N3C Consortium and the RECOVER EHR cohort
Background
In 2021, we used the National COVID Cohort Collaborative (N3C) as part of the National Institutes of Health RECOVER Initiative to develop a machine learning pipeline to identify patients with a high probability of having post-acute sequelae of SARS-CoV-2 infection or long COVID. However, the increased home testing, missing documentation, and reinfections that characterise the pandemic beyond 2022 necessitated the re-engineering of our original model to account for these changes in the COVID-19 research landscape.
Methods
Trained on 72 745 patient records (36 238 with long COVID and 36 507 with no evidence of long COVID), our updated XGBoost model gathered data for each patient in overlapping 100-day periods that progressed through time and issued a probability of long COVID for each 100-day period. We ran the model on patients in N3C (n=5 875 065) who met at least one of the following criteria from Jan 1, 2020, to June 22, 2023: a U07·1 (COVID-19) diagnosis code; a positive SARS-CoV-2 test; a U09·9 (post-acute sequelae of SARS-CoV-2 infection) diagnosis code; a prescription for nirmatrelvir–ritonavir or remdesivir; or an M35·81 (multisystem inflammatory syndrome in children [MIS-C]) diagnosis code. Each patient was given a model score that predicted long COVID status for each 100-day window in which they were aged ≥18 years. If a patient had known acute COVID-19 during any 100-day window (including reinfections), we censored the data from 7 days before the diagnosis or positive test date to 28 days after. We ran the model on controls selected from pre-2020 data to assess the likelihood of false positives.
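The censoring rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the consortium's pipeline: the function names `censor_window` and `is_censored`, the example dates, and the choice to treat both ends of the span as inclusive are all hypothetical. Only the 7-day and 28-day offsets come from the text.

```python
from datetime import date, timedelta

# Offsets taken from the Methods text; everything else here is illustrative.
DAYS_BEFORE = 7
DAYS_AFTER = 28


def censor_window(infection_date: date) -> tuple[date, date]:
    """Return the (start, end) span censored around one acute infection.

    Treating both endpoints as inclusive is an assumption of this sketch.
    """
    return (
        infection_date - timedelta(days=DAYS_BEFORE),
        infection_date + timedelta(days=DAYS_AFTER),
    )


def is_censored(record_date: date, infection_dates: list[date]) -> bool:
    """True if a record falls inside the censored span of any known infection."""
    return any(
        start <= record_date <= end
        for start, end in map(censor_window, infection_dates)
    )


# Hypothetical patient with an initial infection and one reinfection.
infections = [date(2022, 1, 10), date(2022, 9, 1)]
print(is_censored(date(2022, 1, 5), infections))  # 5 days before the first infection
print(is_censored(date(2022, 3, 1), infections))  # outside both censored spans
```

Because each known infection, including a reinfection, contributes its own span, a record is dropped if it falls inside any of them, which matches the "including reinfections" wording of the text.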
Findings
The updated model had an area under the receiver operating characteristic curve of 0·90. Precision and recall could be adjusted according to a given use case, depending on whether greater sensitivity or specificity was warranted. Using our model, we estimate the overall prevalence of long COVID among the COVID-19-positive cohort within the N3C repository to be 10·4%.
Interpretation
By eschewing the COVID-19 index date as an anchor point for analysis, we can assess the probability of long COVID among patients who might have tested at home, who have suspected (but untested) COVID-19, or who have had multiple SARS-CoV-2 reinfections. We view this exercise as a model for maintaining and updating any machine learning pipeline used for clinical research and operations.
Funding
National Institutes of Health RECOVER Initiative.
Title: Re-engineering a machine learning phenotype to adapt to the changing COVID-19 landscape: a machine learning modelling study from the N3C and RECOVER consortia. Lancet Digital Health, volume 7, issue 8, Article 100887.