Pub Date: 2025-08-01; Epub Date: 2025-08-25; DOI: 10.1016/j.landig.2025.100887
Miles Crosskey PhD, Tomas McIntee PhD, Sandy Preiss MS, Daniel Brannock MS, John M Baratta MD, Yun Jae Yoo BS, Emily Hadley MS, Frank Blanceró BA, Robert Chew MS, Johanna Loomba MS, Abhishek Bhatia MS, Prof Christopher G Chute MD, Prof Melissa Haendel PhD, Richard Moffitt PhD, Emily R Pfaff PhD, N3C Consortium and the RECOVER EHR cohort
Background
In 2021, we used the National COVID Cohort Collaborative (N3C) as part of the National Institutes of Health RECOVER Initiative to develop a machine learning pipeline to identify patients with a high probability of having post-acute sequelae of SARS-CoV-2 infection or long COVID. However, the increased home testing, missing documentation, and reinfections that characterise the pandemic beyond 2022 necessitated the re-engineering of our original model to account for these changes in the COVID-19 research landscape.
Methods
Trained on 72 745 patient records (36 238 with long COVID and 36 507 with no evidence of long COVID), our updated XGBoost model gathered data for each patient in overlapping 100-day periods that progressed through time and issued a probability of long COVID for each 100-day period. We ran the model on patients in N3C (n=5 875 065) who met at least one of the following criteria from Jan 1, 2020, to June 22, 2023: a U07·1 (COVID-19) diagnosis code; a positive SARS-CoV-2 test; a U09·9 (post-acute sequelae of SARS-CoV-2 infection) diagnosis code; a prescription for nirmatrelvir–ritonavir or remdesivir; or an M35·81 (multisystem inflammatory syndrome in children [MIS-C]) diagnosis code. Each patient was given a model score that predicted long COVID status for each 100-day window in which they were aged ≥18 years. If a patient had known acute COVID-19 during any 100-day window (including reinfections), we censored the data from 7 days before the diagnosis or positive test date to 28 days after. We ran the model on controls selected from pre-2020 data to assess the likelihood of false positives.
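The windowing and censoring rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the 100-day window length and the censoring interval (7 days before to 28 days after an acute COVID-19 date) come from the abstract, whereas the 50-day step between windows, the function names, and the `(day, code)` event format are assumptions made only for the example.

```python
from datetime import date, timedelta

WINDOW_DAYS = 100    # window length stated in the abstract
STEP_DAYS = 50       # assumption: the abstract says only that windows overlap
CENSOR_BEFORE = 7    # days censored before an acute COVID-19 date
CENSOR_AFTER = 28    # days censored after an acute COVID-19 date


def make_windows(start, end, window_days=WINDOW_DAYS, step_days=STEP_DAYS):
    """Return inclusive (window_start, window_end) pairs that fit inside start..end."""
    windows = []
    cursor = start
    while cursor + timedelta(days=window_days - 1) <= end:
        windows.append((cursor, cursor + timedelta(days=window_days - 1)))
        cursor += timedelta(days=step_days)
    return windows


def is_censored(event_day, acute_dates):
    """True if event_day lies from 7 days before to 28 days after any acute date."""
    return any(
        acute - timedelta(days=CENSOR_BEFORE)
        <= event_day
        <= acute + timedelta(days=CENSOR_AFTER)
        for acute in acute_dates
    )


def events_for_window(window, events, acute_dates):
    """Keep the (day, code) events that fall inside the window and are not censored."""
    first, last = window
    return [
        (day, code)
        for day, code in events
        if first <= day <= last and not is_censored(day, acute_dates)
    ]


# Hypothetical patient: one documented infection and two recorded events.
acute = [date(2022, 3, 10)]
events = [(date(2022, 3, 5), "cough"), (date(2022, 5, 1), "fatigue")]
windows = make_windows(date(2022, 1, 1), date(2022, 12, 31))
kept = events_for_window(windows[1], events, acute)
```

In this toy case the cough recorded 5 days before the positive test is censored, while the fatigue record several weeks later stays in the window and would be available to a downstream classifier.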
Findings
The updated model had an area under the receiver operating characteristic curve of 0·90. Precision and recall could be adjusted according to a given use case, depending on whether greater sensitivity or specificity was warranted. Using our model, we estimate the overall prevalence of long COVID among the COVID-19-positive cohort within the N3C repository to be 10·4%.
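The trade-off between precision and recall mentioned above comes from the choice of cut-off applied to the model's probability output. The sketch below shows the mechanism on made-up scores and labels; the function name, the data, and the thresholds are illustrative assumptions and are not taken from the study.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical model scores and true labels (1 = long COVID).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.90, 0.50, 0.35):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold favours precision (fewer false positives), and lowering it favours recall (fewer missed cases), which is why a single model can serve use cases with different tolerances.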
Interpretation
By eschewing the COVID-19 index date as an anchor point for analysis, we can assess the probability of long COVID among patients who might have tested at home, who have suspected (but untested) COVID-19, or who have had multiple SARS-CoV-2 reinfections. We view this exercise as a model for maintaining and updating any machine learning pipeline used for clinical research and operations.
Funding
National Institutes of Health RECOVER Initiative.
Re-engineering a machine learning phenotype to adapt to the changing COVID-19 landscape: a machine learning modelling study from the N3C and RECOVER consortia. Lancet Digital Health 2025; 7(8): Article 100887.
Pub Date: 2025-08-01; Epub Date: 2025-08-14; DOI: 10.1016/j.landig.2025.100882
Suzanne L van Winkel MSc, Jim Peters MSc, Natasja Janssen PhD, Jaap Kroes PhD, Elizabeth A Loehrer PhD, Jessie Gommers MSc, Prof Ioannis Sechopoulos PhD, Linda de Munck PhD, Jonas Teuwen PhD, Prof Mireille Broeders PhD, Prof Nico Karssemeijer PhD, Ritse M Mann MD
Background
Breast cancer screening programmes have been shown to reduce mortality, but current methods face challenges such as limited mammographic sensitivity, limited resources, and variability in radiologist expertise. Artificial intelligence (AI) offers potential to improve screening accuracy and efficiency. This study simulated different screening scenarios, evaluating the performance of population-based breast cancer screening when using an AI system as a stand-alone reader or second reader.
Methods
In this retrospective cohort study, 42 236 consecutive 2D mammograms from 42 100 women attending the Dutch population-based breast cancer screening between Sept 1, 2016, and Aug 31, 2018, were processed by an AI-based cancer detection system (Transpara version 1.7.0, ScreenPoint Medical). Verified outcomes from the Netherlands Cancer Registry on screen-detected cancers, interval cancers, and breast cancers detected later (future breast cancers) were available with 4-year follow-up. We compared sensitivity, specificity, and recall rate between single human reading, double human reading, stand-alone AI reading, and combined single human reading with AI. Furthermore, we assessed potential differences in performance regarding breast density, tumour size, lymph-node positivity, and invasiveness between cancers identified by single human readers and AI alone.
Findings
After follow-up, 580 mammograms (579 women) were labelled positive: 291 screen-detected cancers, 102 interval cancers, and 187 future breast cancers. Double human reading recalled 1244 mammograms (2·9%, 291 screen-detected cancers) and combined single human reading with AI recalled 2112 mammograms (5·0%, 282 screen-detected cancers, 29 interval cancers, 38 future breast cancers), improving the sensitivity by 8·4% (95% CI 5·7–11·2, p<0·0001). No significant difference in the performance of combined single human reading with AI was found across density categories. AI-detected future breast cancers and interval cancers missed by human readers were more often invasive cancers (26·7%) or tumours larger than 20 mm in diameter (16·6%) by the time of eventual detection compared with the average screen-detected cancers.
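The recall rates quoted above follow directly from the reported counts, and the arithmetic is easy to check. The short sketch below recomputes them from the numbers in this abstract only; the variable and function names are illustrative, and it deliberately does not attempt to reproduce the reported sensitivity difference, which depends on analysis details the abstract does not give.

```python
TOTAL_MAMMOGRAMS = 42_236  # mammograms processed, from the Methods

# Recalled mammograms per reading strategy, from the Findings.
recalled = {
    "double human reading": 1244,
    "single human reading with AI": 2112,
}

# Positive labels after follow-up, from the Findings.
positives = {"screen-detected": 291, "interval": 102, "future": 187}


def recall_rate(n_recalled, n_total=TOTAL_MAMMOGRAMS):
    """Fraction of all screening mammograms that were recalled."""
    return n_recalled / n_total


for strategy, n in recalled.items():
    print(f"{strategy}: {100 * recall_rate(n):.1f}% recalled")

# Additional recalls per 1000 mammograms when AI replaces the second reader.
extra_per_1000 = 1000 * (
    recall_rate(recalled["single human reading with AI"])
    - recall_rate(recalled["double human reading"])
)
print(f"additional recalls per 1000 mammograms: {extra_per_1000:.1f}")
```

The rise in recalled mammograms is the reason the authors' interpretation calls for an effective arbitration process.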
Interpretation
Evaluating screening mammograms with one human reader and AI leads to increased breast cancer detection compared with double human reading, independent of breast density. However, an effective arbitration process is needed as the recall rate increases. AI-identified breast cancers that are missed by human readers seem larger and more often invasive by the time they are eventually detected, confirming the clinical relevance of these cases, which AI can recognise at an earlier stage.
Funding
MARBLE.
AI as an independent second reader in detection of clinically relevant breast cancers within a population-based screening programme in the Netherlands: a retrospective cohort study. Lancet Digital Health 2025; 7(8): Article 100882.
Pub Date: 2025-08-01; Epub Date: 2025-08-12; DOI: 10.1016/j.landig.2025.100909
The Lancet Digital Health
Rapid generative AI rollout in health care. Lancet Digital Health 2025; 7(8): Article 100909.
Pub Date: 2025-08-01; Epub Date: 2025-07-23; DOI: 10.1016/j.landig.2025.100898
Bilal A Mateen, Vasee Moorthy, Alain Labrique, Jeremy Farrar
Artificial intelligence and clinical trials: a framework for effective adoption. Lancet Digital Health 2025; 7(8): Article 100898.
Pub Date: 2025-08-01; Epub Date: 2025-08-19; DOI: 10.1016/j.landig.2025.100891
Marco Gustav MSc, Marko van Treeck MSc, Nic G Reitsam MD, Zunamys I Carrero PhD, Chiara M L Loeffler MD, Asier Rabasco Meneghetti PhD, Prof Bruno Märkl MD, Prof Lisa A Boardman MD, Amy J French MSc, Prof Ellen L Goode PhD, Andrea Gsur PhD, Stefanie Brezina PhD, Prof Marc J Gunter PhD, Neil Murphy PhD, Pia Hönscheid PhD, Christian Sperling MTLA, Sebastian Foersch MD, Robert Steinfelder PhD, Tabitha Harrison MPH, Prof Ulrike Peters PhD, Prof Jakob Nikolas Kather MD
Background
Deep learning-based models enable the prediction of molecular biomarkers from histopathology slides of colorectal cancer stained with haematoxylin and eosin; however, few studies have assessed prediction targets beyond microsatellite instability (MSI), BRAF, and KRAS systematically. We aimed to develop and validate a multi-target model based on deep learning for the simultaneous prediction of numerous genetic alterations and their associated phenotypes in colorectal cancer.
Methods
In this multicentre cohort study, tissue samples from patients with colorectal cancer were obtained by surgical resection and stained with haematoxylin and eosin. These samples were then digitised into whole-slide images and used to train and test a transformer-based deep learning algorithm for biomarker detection to simultaneously predict multiple genetic alterations and provide heatmap explanations. The primary dataset comprised 1376 patients from five cohorts who underwent comprehensive panel sequencing, with an additional 536 patients from two public datasets for validation. We compared the model's performance against conventional single-target models and examined the co-occurrence of alterations and shared morphology.
Findings
The multi-target model was able to predict numerous biomarkers from pathology slides, matching and partly exceeding single-target transformers. In the primary external validation cohorts, mean area under the receiver operating characteristic curve (AUROC) for the multi-target transformer was 0·78 (SD 0·01) for BRAF, 0·88 (0·01) for hypermutation, 0·93 (0·01) for MSI, and 0·86 (0·01) for RNF43; predictive performance was consistent across metrics and supported by co-occurrence analyses. However, biomarkers with high AUROCs largely correlated with MSI, with model predictions depending considerably on morphology associated with MSI at pathological examination.
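Reporting one AUROC per biomarker, as above, amounts to scoring each prediction target separately against its own ground truth. The sketch below shows that bookkeeping with a rank-based AUROC in plain Python. The function names and the toy scores are assumptions for illustration; the study's own evaluation code and its averaging across runs (the reported means and SDs) are not described in the abstract and are not reproduced here.

```python
def auroc(scores, labels):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_target_auroc(predictions, truth):
    """Map each target name to its AUROC; inputs are dicts of per-patient lists."""
    return {target: auroc(predictions[target], truth[target]) for target in predictions}


# Hypothetical scores for four patients and two targets.
predictions = {
    "MSI": [0.9, 0.8, 0.3, 0.2],
    "BRAF": [0.7, 0.4, 0.6, 0.1],
}
truth = {
    "MSI": [1, 1, 0, 0],
    "BRAF": [1, 1, 0, 0],
}
results = per_target_auroc(predictions, truth)
```

Because targets that co-occur with MSI share morphology, a high per-target AUROC on its own does not show that the model learned a biomarker-specific pattern, which is the caveat the Findings raise.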
Interpretation
By use of morphology associated with MSI and more subtle biomarker-specific patterns within a shared phenotype, the multi-target transformers efficiently predicted biomarker status for diverse genetic alterations in colorectal cancer from slides stained with haematoxylin and eosin. These results highlight the importance of considering mutational co-occurrence and common morphology in biomarker research based on deep learning. Our validated and scalable model could support extension to other cancers and large, diverse cohorts, potentially facilitating cost-effective pre-screening and streamlined diagnostics in precision oncology.
Funding
German Federal Ministry of Health, Max-Eder-Programme of German Cancer Aid, German Federal Ministry of Education and Research, German Academic Exchange Service, and the EU.
Assessing genotype–phenotype correlations in colorectal cancer with deep learning: a multicentre cohort study. Lancet Digital Health 2025; 7(8): Article 100891.
Pub Date: 2025-08-01; Epub Date: 2025-07-21; DOI: 10.1016/j.landig.2025.100880
Daniel R Balcarcel MD, Sanjiv D Mehta MD, Celeste G Dixon MD, Charlotte Z Woods-Hill MD, Ewan C Goligher MD, Wouter A C van Amsterdam MD, Nadir Yehya MD
Prognostic models developed for use in the intensive care unit (ICU) can inform treatment decisions and improve patient care. However, despite extensive research, few models have contributed to improved patient-centred outcomes. A major limitation is that the influence of treatment interventions on patient outcomes during model development and validation is often overlooked. Upon implementation, prognostic models can affect clinical interventions, creating feedback loops that alter the relationship between predictors and observed patient outcomes. This alteration caused by model-mediated intervention is known as model drift. Positive feedback loops reinforce initial prognoses, leading to self-fulfilling prophecies, whereas negative feedback loops obscure the efficacy of successful interventions by rendering them as apparent model inaccuracies. To mitigate these issues, prognostic models for use in ICUs should account for treatment effects and the causal relationships among predictions, interventions, and outcomes. Thus, collaboration among data scientists, epidemiologists, clinical researchers, and implementation scientists is required to ensure that prognostic models enhance patient care without causing inadvertent harm.
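The negative feedback loop described above can be made concrete with a small expected-value calculation. In this sketch, a hypothetical model flags patients whose predicted risk meets a threshold, flagged patients receive an intervention that cuts their true risk, and the observed event rate then falls below the prediction. The threshold, the relative risk reduction, and all names are invented for illustration; the paper does not specify any such values.

```python
def observed_event_rate(predicted_risk, treat_threshold, relative_risk_reduction):
    """Expected event rate once the model's own alert has triggered treatment.

    Patients at or above treat_threshold receive an intervention that scales
    their risk by (1 - relative_risk_reduction); others are left untreated.
    """
    if predicted_risk >= treat_threshold:
        return predicted_risk * (1 - relative_risk_reduction)
    return predicted_risk


def apparent_error(predicted_risk, treat_threshold, relative_risk_reduction):
    """Gap between predicted and observed risk that a naive validation would see."""
    observed = observed_event_rate(
        predicted_risk, treat_threshold, relative_risk_reduction
    )
    return predicted_risk - observed


# Hypothetical settings: treat at 50% predicted risk, intervention halves risk.
THRESHOLD = 0.5
RISK_REDUCTION = 0.5

for risk in (0.2, 0.6, 0.9):
    gap = apparent_error(risk, THRESHOLD, RISK_REDUCTION)
    print(f"predicted={risk:.2f} apparent overestimate={gap:.2f}")
```

A validation study that ignores the intervention would read the gap in the high-risk group as miscalibration, even though it reflects a treatment that worked; this is the sense in which the authors ask for models that account for treatment effects.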
Feedback loops in intensive care unit prognostic models: an under-recognised threat to clinical validity. Lancet Digital Health 2025; 7(8): Article 100880.
Pub Date: 2025-07-01; Epub Date: 2025-05-24; DOI: 10.1016/j.landig.2025.100879
Kunal Rajput, Ara Darzi, Saira Ghafur
Overlooked and under-reported: the impact of cyberattacks on primary care in the UK National Health Service. Lancet Digital Health 2025; 7(7): Article 100879.
Pub Date: 2025-07-01; Epub Date: 2025-06-03; DOI: 10.1016/j.landig.2025.100885
Charles T A Parker MB BChir, Larissa Mendes PhD, Vinnie Y T Liu MSc, Emily Grist PhD, Songwan Joun MSc, Rikiya Yamashita MD PhD, Akinori Mitani MD PhD, Emmalyn Chen PhD, Marina A Parry PhD, Ashwin Sachdeva PhD, Laura Murphy PhD, Huei-Chung Huang MA, Jacqueline Griffin PhD, Douwe van der Wal MSc, Tamara Todorovic MPH, Sharanpreet Lall BSc, Sara Santos Vidal MSc, Miriam Goncalves BSc, Suparna Thakali BSc, Anna Wingate MSc, Prof Gerhardt Attard PhD
Background
Effective prognostication improves selection of patients with prostate cancer for treatment combinations. We aimed to evaluate whether a previously developed multimodal artificial intelligence (MMAI) algorithm was prognostic in very advanced prostate cancer using data from four phase 3 trials of the STAMPEDE platform protocol.
Methods
We included patients starting androgen-deprivation therapy in the docetaxel, docetaxel plus zoledronic acid, abiraterone, or abiraterone plus enzalutamide trials. Patients were recruited at 112 sites. We combined all standard-of-care control patients (including those allocated to standard of care [SOC-ADT] consisting of testosterone suppression with luteinising hormone-releasing hormone agonists or antagonists, and radiotherapy when indicated), and we combined the rest of the patients into docetaxel-treated or abiraterone-treated groups. Patients had either metastatic disease or were at very high-risk of metastatic disease, determined by node-positivity or, if node-negative, by T stage, serum prostate-specific antigen (PSA) level, and Gleason score. We used the locked ArteraAI Prostate MMAI algorithm that combined these clinical variables, age, and digitised prostate biopsy pathology images. We performed Fine–Gray and Cox regression adjusted for treatment allocation and cumulative incidence analyses at 5 years to evaluate associations with prostate cancer-specific mortality (PCSM) for continuous (per SD increase) and categorical (quartile-Q) scores. The STAMPEDE platform protocol is registered with ClinicalTrials.gov, NCT00268476.
Findings
Of 5213 eligible patients recruited from Oct 5, 2005, to March 31, 2016, 3167 were included in this analysis (1575 [49·7%] with non-metastatic disease, 1592 [50·3%] with metastatic disease; median follow-up 6·9 years [IQR 5·9–8·0]) with all datapoints available for score generation. The MMAI algorithm (per SD increase) was strongly associated with PCSM (hazard ratio [HR] 1·40, 95% CI 1·30–1·51, p<0·0001). On ad-hoc inspection, the highest scoring quartile of patients in each disease and treatment allocation group (MMAI Q4; vs the bottom three quartiles, Q1–3) had the highest PCSM risk in both patients with non-metastatic disease (HR 2·12, 1·61–2·81, p<0·0001) and those with metastatic disease (HR 1·62, 1·39–1·88, p<0·0001). MMAI quartile stratification split patients categorised by disease burden into groups with notably different risks of 5-year PCSM: patients with non-metastatic disease that were node-negative could be further stratified by MMAI score quartile Q1–3 (3%, 2–4) versus Q4 (11%, 7–15), those with non-metastatic disease that were node-positive could be stratified by Q1–3 (11%, 8–14) versus Q4 (20%, 13–26), those with metastatic disease with low-volume could be strati
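Background: Effective prognostication improves selection of patients with prostate cancer for treatment combinations. We aimed to evaluate whether a previously developed multimodal artificial intelligence (MMAI) algorithm was prognostic in advanced prostate cancer using data from four phase 3 trials of the STAMPEDE platform protocol. Methods: We included patients starting androgen-deprivation therapy in the docetaxel, docetaxel plus zoledronic acid, abiraterone, or abiraterone plus enzalutamide trials. Patients were recruited at 112 sites. We combined all standard-of-care control patients (including those allocated to standard of care [SOC-ADT], consisting of testosterone suppression with luteinising hormone-releasing hormone agonists or antagonists, and radiotherapy when indicated), and we combined the remaining patients into docetaxel-treated or abiraterone-treated groups. Patients either had metastatic disease or were at high risk of metastatic disease, determined by node-positivity or, if node-negative, by T stage, serum prostate-specific antigen (PSA) level, and Gleason score. We used the locked ArteraAI Prostate MMAI algorithm, which combined these clinical variables, age, and digitised prostate biopsy pathology images. We performed Fine–Gray and Cox regression adjusted for treatment allocation, and cumulative incidence analyses at 5 years, to evaluate associations with prostate cancer-specific mortality (PCSM) for continuous (per SD increase) and categorical (quartile, Q) scores. The STAMPEDE platform protocol is registered with ClinicalTrials.gov, NCT00268476. Findings: Of 5213 eligible patients recruited from Oct 5, 2005, to March 31, 2016, 3167 were included in this analysis (1575 [49·7%] with non-metastatic disease, 1592 [50·3%] with metastatic disease; median follow-up 6·9 years [IQR 5·9–8·0]), with all datapoints available for score generation. The MMAI algorithm (per SD increase) was strongly associated with PCSM (hazard ratio [HR] 1·40, 95% CI 1·30–1·51). Interpretation: Diagnostic prostate biopsy samples contain prognostic information in patients with radiologically overt metastatic prostate cancer or at high risk of it. The MMAI algorithm, combined with disease burden, improves prognostication in advanced prostate cancer. Funding: Prostate Cancer UK, UK Medical Research Council, Cancer Research UK, John Black Charitable Foundation, Prostate Cancer Foundation, Sanofi Aventis, Janssen, Astellas, Novartis, Artera.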
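To make the "per SD increase" hazard ratio above easier to read, here is a minimal Python sketch of the arithmetic behind it. It assumes the usual proportional-hazards form (log-hazard linear in the standardised score) and a symmetric Wald interval on the log scale; the abstract does not say which of the two regression models produced this estimate, and the function names are illustrative only. The 2-SD figure is an implication of that assumption, not a result reported by the study.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error from a reported
    HR and 95% CI, assuming a symmetric Wald interval on the log scale."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def hr_for_k_sd(hr_per_sd, k):
    """Under a log-linear model, a k-SD increase in the score multiplies
    the hazard by HR ** k."""
    return hr_per_sd ** k


# Values taken from the abstract: HR 1.40 (95% CI 1.30-1.51) per SD increase.
beta, se = log_hr_and_se(1.40, 1.30, 1.51)
hr_2sd = hr_for_k_sd(1.40, 2)  # illustrative extrapolation, not a study result
```

Working on the log scale is the conventional choice because the confidence interval is roughly symmetric there, which is why the standard error is taken from the difference of the logged bounds.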
{"title":"External validation of a digital pathology-based multimodal artificial intelligence-derived prognostic model in patients with advanced prostate cancer starting long-term androgen deprivation therapy: a post-hoc ancillary biomarker study of four phase 3 randomised controlled trials of the STAMPEDE platform protocol","authors":"Charles T A Parker MB BChir , Larissa Mendes PhD , Vinnie Y T Liu MSc , Emily Grist PhD , Songwan Joun MSc , Rikiya Yamashita MD PhD , Akinori Mitani MD PhD , Emmalyn Chen PhD , Marina A Parry PhD , Ashwin Sachdeva PhD , Laura Murphy PhD , Huei-Chung Huang MA , Jacqueline Griffin PhD , Douwe van der Wal MSc , Tamara Todorovic MPH , Sharanpreet Lall BSc , Sara Santos Vidal MSc , Miriam Goncalves BSc , Suparna Thakali BSc , Anna Wingate MSc , Prof Gerhardt Attard PhD","doi":"10.1016/j.landig.2025.100885","DOIUrl":"10.1016/j.landig.2025.100885","url":null,"PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 7","pages":"Article 100885"},"PeriodicalIF":24.1,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144227290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}