Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100923
Efficacy of standalone smartphone apps for mental health: an updated systematic review and meta-analysis
Jennifer K Kulke MSc, Lukas M Fuhrmann MSc, Prof Matthias Berking PhD, Prof David D Ebert PhD, Prof Harald Baumeister PhD, Ariqa Derfiora MSc, Avery Veldhouse MSc, Kiona K Weisel PhD
Background
To map out the potential benefits of widely available smartphone apps for mental health, especially in contexts where face-to-face services are limited or unavailable, it is crucial to examine their efficacy compared with inactive controls. Standalone smartphone apps might offer an accessible option for individuals waiting for treatment or living in under-resourced settings. Given the currently inconclusive evidence regarding these apps, this systematic review and meta-analysis aimed to assess the efficacy and study quality of randomised controlled trials (RCTs) evaluating standalone smartphone apps for mental health.
Methods
In this systematic review and meta-analysis, based on a previously published study, we conducted an updated systematic search of PubMed, PsycINFO, Web of Science, Cochrane Clinical Trial, and Scopus for RCTs published from database inception to Nov 10, 2023. We included RCTs that examined the efficacy of standalone smartphone apps for mental health in adults (age ≥18 years) with heightened symptom severity compared with an inactive control group (eg, waitlist, informational material, and control apps). We excluded control groups that received active treatment. Two independent researchers (AV and AD) extracted summary data, which were verified by a third researcher (JKK). The effect size Hedges' g, 95% CI, and p value were calculated for each target outcome. We applied a random-effects model to all analyses due to the expected heterogeneity between RCTs. We assessed quality using the Risk of Bias 2 tool (dated Aug 22, 2019) and assessed publication bias via Egger's test and the Duval and Tweedie trim-and-fill analysis. The study was registered with PROSPERO, CRD42022310762.
Findings
We retrieved 12 705 records from electronic databases and 74 records from other sources (ie, reviews and meta-analyses on digital interventions for mental health identified through database searches and their reference lists, reference lists of other studies, trial registrations in PROSPERO, and websites of researchers in the field). Of these, we included 72 RCTs (70 reports) with 21 702 participants (of the 21 048 participants with sex or gender data, 14 208 [67%] were female, 6744 [32%] were male, and 96 [<1%] were other). At post assessment (assessment after completion of intervention), we found significant effects of apps targeting depression (33 comparisons; Hedges' g 0·45 [95% CI 0·30 to 0·60], p≤0·0001, I²=81·30%), anxiety (23 comparisons; 0·35 [0·22 to 0·48], p≤0·0001, I²=74·91%), sleep problems (14 comparisons; 0·71 [0·51 to 0·92], p≤0·0001, I²=76·17%), post-traumatic stress disorder (nine comparisons; 0·15 [0·02 to 0·28], p=0·029, I²=28·65%), eating disorders (four comparisons; 0·50 [0·29 to 0·71], p≤0·0001, I²=50·49%), and body …
{"title":"Efficacy of standalone smartphone apps for mental health: an updated systematic review and meta-analysis","authors":"Jennifer K Kulke MSc , Lukas M Fuhrmann MSc , Prof Matthias Berking PhD , Prof David D Ebert PhD , Prof Harald Baumeister PhD , Ariqa Derfiora MSc , Avery Veldhouse MSc , Kiona K Weisel PhD","doi":"10.1016/j.landig.2025.100923","DOIUrl":"10.1016/j.landig.2025.100923","url":null,"abstract":"<div><h3>Background</h3><div>To map out the potential benefits of widely available smartphone apps for mental health, especially in contexts where face-to-face services are limited or unavailable, it is crucial to examine their efficacy compared with inactive controls. Standalone smartphone apps might offer an accessible option for individuals waiting for treatment or living in under-resourced settings. Given the currently inconclusive evidence regarding these apps, this systematic review and meta-analysis aimed to assess the efficacy and study quality of randomised controlled trials (RCTs) evaluating standalone smartphone apps for mental health.</div></div><div><h3>Methods</h3><div>In this systematic review and meta-analysis, based on a previously published study, we conducted an updated systematic search of PubMed, PsycINFO, Web of Science, Cochrane Clinical Trial, and Scopus for RCTs published from database inception to Nov 10, 2023. We included RCTs that examined the efficacy of standalone smartphone apps for mental health in adults (age ≥18 years) with heightened symptom severity compared with an inactive control group (eg, waitlist, informational material, and control apps). We excluded control groups that received active treatment. Two independent researchers (AV and AD) extracted summary data, which were verified by a third researcher (JKK). The effect size Hedges’ <em>g</em>, 95% CI, and p value were calculated for each target outcome. We applied a random-effects model to all analyses due to the expected heterogeneity between RCTs. We assessed quality using the Risk of Bias 2 tool (dated Aug 22, 2019) and assessed publication bias via the Egger's test, and the Duval and Tweedie trim-and-fill analysis. The study was registered with PROSPERO, CRD42022310762.</div></div><div><h3>Findings</h3><div>We retrieved 12 705 records from electronic databases and 74 records from other sources (ie, reviews and meta-analyses on digital interventions for mental health identified through database searches and their reference lists, reference lists of other studies, trial registrations in PROSPERO, and websites of researchers in the field). Of these, we included 72 RCTs (70 reports) with 21 702 participants (of the 21 048 participants with sex or gender data, 14 208 [67%] were female, 6744 [32%] were male, and 96 [<1%] were other). 
At post assessment (assessment after completion of intervention), we found significant effects of apps targeting depression (33 comparisons; Hedges’ <em>g</em> 0·45 [95% CI 0·30 to 0·60], p≤0·0001, <em>I</em><sup>2</sup>=81·30%), anxiety (23 comparisons; 0·35 [0·22 to 0·48], p≤0·0001, <em>I</em><sup>2</sup>=74·91%), sleep problems (14 comparisons; 0·71 [0·51 to 0·92], p≤0·0001, <em>I</em><sup>2</sup>=76·17%), post-traumatic stress disorder (nine comparisons; 0·15 [0·02 to 0·28], p=0·029, <em>I</em><sup>2</sup>=28·65%), eating disorders (four comparisons; 0·50 [0·29 to 0·71], p≤0·0001, <em>I</em><sup>2</sup>=50·49%), and body","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100923"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145606895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
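The quantities named in the Methods (Hedges' g with its sampling variance, random-effects pooling, I²) follow standard meta-analytic formulas. A minimal sketch, assuming per-arm means, SDs, and sample sizes and using the DerSimonian-Laird estimator, which is a common default rather than necessarily the authors' exact implementation:

```python
# Sketch of the per-comparison effect size and random-effects pooling named in
# the Methods; not the authors' code. Inputs are per-arm summary statistics.
import numpy as np

def hedges_g(m1, s1, n1, m2, s2, n2):
    """Bias-corrected standardised mean difference and its approximate variance."""
    sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)           # small-sample correction factor
    g = j * d
    v = (n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2))  # approximate variance of g
    return g, v

def random_effects(gs, vs):
    """DerSimonian-Laird pooling: pooled g, 95% CI, and I-squared (%)."""
    gs, vs = np.asarray(gs), np.asarray(vs)
    w = 1 / vs                                 # fixed-effect (inverse-variance) weights
    g_fixed = np.sum(w * gs) / np.sum(w)
    q = np.sum(w * (gs - g_fixed)**2)          # Cochran's Q
    df = len(gs) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_star = 1 / (vs + tau2)                   # random-effects weights
    g_pooled = np.sum(w_star * gs) / np.sum(w_star)
    se = np.sqrt(1 / np.sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return g_pooled, (g_pooled - 1.96 * se, g_pooled + 1.96 * se), i2
```

Feeding one (g, v) pair per comparison into random_effects reproduces the shape of the reported results: a pooled Hedges' g with 95% CI alongside I² as the heterogeneity summary.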
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100902
Cloud computing for equitable, data-driven dementia medicine
Marcella Montagnese PhD, Bojidar Rangelov PhD, Tom Doel PhD, Prof David Llewellyn PhD, Prof Zuzana Walker MD PhD, Timothy Rittman MD PhD, Neil P Oxtoby PhD
Dementia poses an increasing global health challenge, and the introduction of new drugs with diverse activity profiles underscores the need for the rapid development and deployment of tailored predictive models. Machine learning has shown promise in dementia research, but it remains largely untested in routine dementia health care—particularly for image-based decision support—owing to data unavailability. Thus, data drift remains a key barrier for equitable real-world translation. We propose and pilot a scalable, cloud-based infrastructure-as-code solution incorporating privacy-preserving federated learning. This architecture preserves patient privacy by keeping data localised and secure, while enabling the development of robust, adaptable artificial intelligence models. Although technology giants have successfully implemented such technologies in consumer applications, their potential in health-care applications remains largely underutilised. This Viewpoint outlines the key challenges and solutions in implementing cloud-based federated learning for dementia medicine and provides a well-documented codebase to support further research.
{"title":"Cloud computing for equitable, data-driven dementia medicine","authors":"Marcella Montagnese PhD , Bojidar Rangelov PhD , Tom Doel PhD , Prof David Llewellyn PhD , Prof Zuzana Walker MD PhD , Timothy Rittman MD PhD , Neil P Oxtoby PhD","doi":"10.1016/j.landig.2025.100902","DOIUrl":"10.1016/j.landig.2025.100902","url":null,"abstract":"<div><div>Dementia poses an increasing global health challenge, and the introduction of new drugs with diverse activity profiles underscores the need for the rapid development and deployment of tailored predictive models. Machine learning has shown promise in dementia research, but it remains largely untested in routine dementia health care—particularly for image-based decision support—owing to data unavailability. Thus, data drift remains a key barrier for equitable real-world translation. We propose and pilot a scalable, cloud-based infrastructure as code solution incorporating privacy-preserving federated learning. This architecture preserves patient privacy by keeping data localised and secure, while enabling the development of robust, adaptable artificial intelligence models. Although technology giants have successfully implemented such technologies in consumer applications, their potential in health-care applications remains largely underutilised. This Viewpoint outlines the key challenges and solutions in implementing cloud-based federated learning for dementia medicine and provides a well-documented codebase to support further research.</div></div>","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100902"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145483530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100914
Automated retinal image analysis systems to triage for grading of diabetic retinopathy: a large-scale, open-label, national screening programme in England
Prof Alicja R Rudnicka PhD, Royce Shakespeare MSc, Ryan Chambers BEng, Louis Bolter MSc, John Anderson MD, Jiri Fajtl PhD, Roshan A Welikala PhD, Prof Sarah A Barman PhD, Abraham Olvera-Barrios MD, Laura Webster, Samantha Mann MD, Aaron Lee MD, Prof Paolo Remagnino PhD, Catherine Egan MD, Prof Christopher G Owen PhD, Prof Adnan Tufail MD
Background
The global prevalence of diabetes is rising, alongside costs and workload associated with screening for diabetic eye disease (diabetic retinopathy). Automated retinal image analysis systems (ARIAS) could replace primary human grading of images for diabetic retinopathy. We evaluated multiple ARIAS in a real-life screening programme.
Methods
Eight of 25 invited and potentially eligible CE-marked systems for diabetic retinopathy detection from retinal images agreed to participate. From 202 886 screening encounters at the North East London Diabetic Eye Screening Programme (between Jan 1, 2021, and Dec 31, 2022), we curated a database of 1·2 million images and sociodemographic and grading data. Images were manually graded by up to three graders according to a standard national protocol. ARIAS performance overall and by subgroups of age, sex, ethnicity, and index of multiple deprivation (IMD) was assessed against the reference standard, defined as the final human grade in the worst eye for referable diabetic retinopathy (primary outcome). Vendor algorithms did not have access to human grading data.
Findings
Sensitivity across vendors ranged from 83·7% to 98·7% for referable diabetic retinopathy, from 96·7% to 99·8% for moderate-to-severe non-proliferative diabetic retinopathy, and from 95·8% to 99·5% for proliferative diabetic retinopathy. Sensitivity was largely consistent for moderate-to-severe non-proliferative and proliferative diabetic retinopathy by subgroups of age, sex, ethnicity, and IMD for all ARIAS. For mild-to-moderate non-proliferative diabetic retinopathy with referable maculopathy, sensitivity across vendors ranged from 79·5% to 98·3%, with greater variability across population subgroups. False positive rates for no observable diabetic retinopathy ranged from 4·3% to 61·4% and within vendors varied by 0·5 to 44 percentage points across population subgroups.
Interpretation
ARIAS showed high sensitivity for medium-risk and high-risk diabetic retinopathy in a real-world screening service, with equitable performance across population subgroups. ARIAS could provide a cost-effective solution to the rising burden of diabetic retinopathy screening by safely triaging images for human grading, substantially increasing grading capacity and enabling rapid diabetic retinopathy detection.
Funding
NHS Transformation Directorate, The Health Foundation, and The Wellcome Trust.
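The subgroup evaluation described in the Methods and Findings reduces to two per-stratum rates: sensitivity among reference-positive encounters and false-positive rate among encounters the reference standard graded as having no observable retinopathy. A hedged sketch, with column names that are hypothetical rather than from the study's data dictionary:

```python
# Sketch of per-vendor, per-subgroup sensitivity and false-positive rate against
# a human reference grade. Column names ('referral_ref', 'referral_pred',
# 'ethnicity', ...) are assumptions for illustration.
import pandas as pd

def subgroup_rates(df: pd.DataFrame, subgroup: str) -> pd.DataFrame:
    """df columns (assumed): 'referral_ref' (0/1 final human grade) and
    'referral_pred' (0/1 ARIAS output), plus a subgroup column such as 'ethnicity'."""
    def rates(g: pd.DataFrame) -> pd.Series:
        pos = g[g.referral_ref == 1]      # reference-positive encounters
        neg = g[g.referral_ref == 0]      # no referable disease per reference
        return pd.Series({
            "sensitivity": (pos.referral_pred == 1).mean(),
            "false_positive_rate": (neg.referral_pred == 1).mean(),
            "n": len(g),
        })
    return df.groupby(subgroup).apply(rates)
```

Running this per vendor and per subgroup column (age band, sex, ethnicity, IMD quintile) yields exactly the kind of equity table summarised in the Findings.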
{"title":"Automated retinal image analysis systems to triage for grading of diabetic retinopathy: a large-scale, open-label, national screening programme in England","authors":"Prof Alicja R Rudnicka PhD , Royce Shakespeare MSc , Ryan Chambers BEng , Louis Bolter MSc , John Anderson MD , Jiri Fajtl PhD , Roshan A Welikala PhD , Prof Sarah A Barman PhD , Abraham Olvera-Barrios MD , Laura Webster , Samantha Mann MD , Aaron Lee MD , Prof Paolo Remagnino PhD , Catherine Egan MD , Prof Christopher G Owen PhD , Prof Adnan Tufail MD","doi":"10.1016/j.landig.2025.100914","DOIUrl":"10.1016/j.landig.2025.100914","url":null,"abstract":"<div><h3>Background</h3><div>The global prevalence of diabetes is rising, alongside costs and workload associated with screening for diabetic eye disease (diabetic retinopathy). Automated retinal image analysis systems (ARIAS) could replace primary human grading of images for diabetic retinopathy. We evaluated multiple ARIAS in a real-life screening programme.</div></div><div><h3>Methods</h3><div>Eight of 25 invited and potentially eligible CE-marked systems for diabetic retinopathy detection from retinal images agreed to participate. From 202 886 screening encounters at the North East London Diabetic Eye Screening Programme (between Jan 1, 2021, and Dec 31, 2022) we curated a database of 1·2 million images and sociodemographic and grading data. Images were manually graded by up to three graders according to a standard national protocol. ARIAS performance overall and by subgroups of age, sex, ethnicity, and index of multiple deprivation (IMD) were assessed against the reference standard, defined as the final human grade in the worst eye for referable diabetic retinopathy (primary outcome). Vendor algorithms did not have access to human grading data.</div></div><div><h3>Findings</h3><div>Sensitivity across vendors ranged from 83·7% to 98·7% for referable diabetic retinopathy, from 96·7% to 99·8% for moderate-to-severe non-proliferative diabetic retinopathy, and from 95·8% to 99·5% for proliferative diabetic retinopathy. Sensitivity was largely consistent for moderate-to-severe non-proliferative and proliferative diabetic retinopathy by subgroups of age, sex, ethnicity, and IMD for all ARIAS. For mild-to-moderate non-proliferative diabetic retinopathy with referable maculopathy, sensitivity across vendors ranged from 79·5% to 98·3%, with greater variability across population subgroups. False positive rates for no observable diabetic retinopathy ranged from 4·3% to 61·4% and within vendors varied by 0·5 to 44 percentage points across population subgroups.</div></div><div><h3>Interpretation</h3><div>ARIAS showed high sensitivity for medium-risk and high-risk diabetic retinopathy in a real-world screening service, with equitable performance across population subgroups. 
ARIAS could provide a cost-effective solution to deal with the rising burden of screening for diabetic retinopathy by safely triaging for human grading, substantially increasing grading capacity and rapid diabetic retinopathy detection.</div></div><div><h3>Funding</h3><div>NHS Transformation Directorate, The Health Foundation, and The Wellcome Trust.</div></div>","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100914"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145606945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100959
Evidence and responsibility of artificial intelligence use in mental health care
The Lancet Digital Health
{"title":"Evidence and responsibility of artificial intelligence use in mental health care","authors":"The Lancet Digital Health","doi":"10.1016/j.landig.2025.100959","DOIUrl":"10.1016/j.landig.2025.100959","url":null,"abstract":"","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100959"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145745126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100911
Effective sample size for individual risk predictions: quantifying uncertainty in machine learning models
Doranne Thomassen PhD, Toby Hackmann MSc, Prof Jelle Goeman PhD, Prof Ewout Steyerberg PhD, Prof Saskia le Cessie PhD
Individual prediction uncertainty is a key aspect of clinical prediction model performance; however, standard performance metrics do not capture it. Consequently, a model might offer sufficient certainty for some patients but not for others, raising concerns about fairness. To address this limitation, the effective sample size has been proposed as a measure of sampling uncertainty. We developed a computational method to estimate effective sample sizes for a wide range of prediction models, including machine learning approaches. In this Viewpoint, we illustrated the approach using a clinical dataset (N=23 034) across five model types: logistic regression, elastic net, XGBoost, neural network, and random forest. During simulations, our approach generated accurate estimates of effective sample sizes for logistic regression and elastic net models, with minor deviations noted for the other three models. Although model performance metrics were similar across models, substantial differences in effective sample sizes and risk predictions were observed among patients in the clinical dataset. In conclusion, prediction uncertainty at the individual prediction level can be substantial even when models are developed using large samples. Effective sample size is thus a promising measure to communicate the uncertainty of predicted risk to individual users of machine learning-based prediction models.
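The abstract does not detail the authors' estimator. One plausible operationalisation treats a predicted risk p̂ with sampling variance Var(p̂) as if it came from a binomial sample, giving n_eff ≈ p̂(1−p̂)/Var(p̂), with the variance estimated by bootstrap refitting; the sketch below follows that reading and may differ from the published method:

```python
# Hedged sketch: per-patient effective sample size from the bootstrap variance of
# the predicted risk, via n_eff ~ p(1-p)/Var(p_hat). One plausible reading of the
# idea, not the authors' published estimator.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

def effective_sample_size(X, y, X_new, n_boot=200, seed=0):
    rng = np.random.RandomState(seed)
    preds = np.empty((n_boot, len(X_new)))
    for b in range(n_boot):
        Xb, yb = resample(X, y, random_state=rng)    # refit on a bootstrap resample
        model = LogisticRegression(max_iter=1000).fit(Xb, yb)
        preds[b] = model.predict_proba(X_new)[:, 1]
    p = preds.mean(axis=0)                           # central risk estimate per patient
    var = preds.var(axis=0)                          # sampling uncertainty per patient
    return p * (1 - p) / np.maximum(var, 1e-12)      # binomial-variance-implied n_eff
```

Under this reading, two patients with the same predicted risk can have very different n_eff values, which is the fairness concern the Viewpoint raises.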
{"title":"Effective sample size for individual risk predictions: quantifying uncertainty in machine learning models","authors":"Doranne Thomassen PhD , Toby Hackmann MSc , Prof Jelle Goeman PhD , Prof Ewout Steyerberg PhD , Prof Saskia le Cessie PhD","doi":"10.1016/j.landig.2025.100911","DOIUrl":"10.1016/j.landig.2025.100911","url":null,"abstract":"<div><div>Individual prediction uncertainty is a key aspect of clinical prediction model performance; however, standard performance metrics do not capture it. Consequently, a model might offer sufficient certainty for some patients but not for others, raising concerns about fairness. To address this limitation, the effective sample size has been proposed as a measure of sampling uncertainty. We developed a computational method to estimate effective sample sizes for a wide range of prediction models, including machine learning approaches. In this Viewpoint, we illustrated the approach using a clinical dataset (N=23 034) across five model types: logistic regression, elastic net, XGBoost, neural network, and random forest. During simulations, our approach generated accurate estimates of effective sample sizes for logistic regression and elastic net models, with minor deviations noted for the other three models. Although model performance metrics were similar across models, substantial differences in effective sample sizes and risk predictions were observed among patients in the clinical dataset. In conclusion, prediction uncertainty at the individual prediction level can be substantial even when models are developed using large samples. Effective sample size is thus a promising measure to communicate the uncertainty of predicted risk to individual users of machine learning-based prediction models.</div></div>","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100911"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145641240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100944
Artificial intelligence and tumour-infiltrating lymphocytes in breast cancer: bridging innovation and feasibility towards clinical utility
Federica Miglietta, Maria Vittoria Dieci
{"title":"Artificial intelligence and tumour-infiltrating lymphocytes in breast cancer: bridging innovation and feasibility towards clinical utility","authors":"Federica Miglietta , Maria Vittoria Dieci","doi":"10.1016/j.landig.2025.100944","DOIUrl":"10.1016/j.landig.2025.100944","url":null,"abstract":"","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100944"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145745061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100908
Objective cough counting in clinical practice and public health: a scoping review
Alexandra J Zimmer PhD, Rishav Das MSc, Patricia Espinoza Lopez MD, Vaidehi Nafade MSc, Genevieve Gore MLIS, César Ugarte-Gil PhD, Prof Kian Fan Chung MD, Woo-Jung Song PhD, Prof Madhukar Pai PhD, Simon Grandjean Lapierre MD
Quantifying cough can offer value for respiratory disease assessment and monitoring. Traditionally, patient-reported outcomes have provided subjective insights into symptoms. Novel digital cough counting tools now enable objective assessments; however, their integration into clinical practice is limited. The aim of this scoping review was to address this gap in the literature by examining the use of automated and semiautomated cough counting tools in patient care and public health. A systematic search of six databases and preprint servers identified studies published up to Feb 12, 2025. From 6968 records found, 618 full-text articles were assessed for eligibility, and 77 were included. Five clinical use cases were identified—disease diagnosis, severity assessment, treatment monitoring, health outcome prediction, and syndromic surveillance—with scarce available evidence supporting each use case. Moderate correlations were found between objective cough frequency and patient-reported cough severity (median correlation coefficient of 0·42, IQR 0·38 to 0·59) and quality of life (median correlation coefficient of −0·49, IQR −0·63 to −0·44), indicating a complex relationship between quantifiable measures and perceived symptoms. Feasibility challenges include device obtrusiveness, monitoring adherence, and addressing patient privacy concerns. Comprehensive studies are needed to validate these technologies in real-world settings and show their clinical value. Early feasibility and acceptability assessments are essential for successful integration.
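The pooled figures above are medians with IQRs over per-study correlation coefficients. A trivial sketch of that aggregation step, using placeholder values rather than the review's extracted data:

```python
# Aggregation of per-study correlations between device-counted cough frequency
# and a patient-reported outcome. The r values below are hypothetical
# placeholders, not data from the review.
import numpy as np

study_correlations = [0.38, 0.42, 0.59, 0.31, 0.55]   # hypothetical per-study r values
median_r = np.median(study_correlations)
q1, q3 = np.percentile(study_correlations, [25, 75])  # IQR bounds
print(f"median r = {median_r:.2f} (IQR {q1:.2f} to {q3:.2f})")
```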
{"title":"Objective cough counting in clinical practice and public health: a scoping review","authors":"Alexandra J Zimmer PhD , Rishav Das MSc , Patricia Espinoza Lopez MD , Vaidehi Nafade MSc , Genevieve Gore MLIS , César Ugarte-Gil PhD , Prof Kian Fan Chung MD , Woo-Jung Song PhD , Prof Madhukar Pai PhD , Simon Grandjean Lapierre MD","doi":"10.1016/j.landig.2025.100908","DOIUrl":"10.1016/j.landig.2025.100908","url":null,"abstract":"<div><div>Quantifying cough can offer value for respiratory disease assessment and monitoring. Traditionally, patient-reported outcomes have provided subjective insights into symptoms. Novel digital cough counting tools now enable objective assessments; however, their integration into clinical practice is limited. The aim of this scoping review was to address this gap in the literature by examining the use of automated and semiautomated cough counting tools in patient care and public health. A systematic search of six databases and preprint servers identified studies published up to Feb 12, 2025. From 6968 records found, 618 full-text articles were assessed for eligibility, and 77 were included. Five clinical use cases were identified—disease diagnosis, severity assessment, treatment monitoring, health outcome prediction, and syndromic surveillance—with scarce available evidence supporting each use case. Moderate correlations were found between objective cough frequency and patient-reported cough severity (median correlation coefficient of 0.42, IQR 0·38 to 0·59) and quality of life (median correlation coefficient of −0·49, −0·63 to −0·44), indicating a complex relationship between quantifiable measures and perceived symptoms. Feasibility challenges include device obtrusiveness, monitoring adherence, and addressing patient privacy concerns. Comprehensive studies are needed to validate these technologies in real-world settings and show their clinical value. Early feasibility and acceptability assessments are essential for successful integration.</div></div>","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100908"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145582684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100924
Synthetic data, synthetic trust: navigating data challenges in the digital revolution
Arman Koul MS, Deborah Duran PhD, Tina Hernandez-Boussard PhD
In the evolving landscape of artificial intelligence (AI), the assumption that more data lead to better models has driven unchecked reliance on synthetic data to augment training datasets. Although synthetic data address crucial shortages of real-world training data, their overuse might propagate biases, accelerate model degradation, and compromise generalisability across populations. A concerning consequence of the rapid adoption of synthetic data in medical AI is the emergence of synthetic trust—an unwarranted confidence in models trained on artificially generated datasets that fail to preserve clinical validity or demographic realities. In this Viewpoint, we advocate for caution in using synthetic data to train clinical algorithms. We propose actionable safeguards for synthetic medical AI, including standards for training data, fragility testing during development, and deployment disclosures for synthetic origins to ensure end-to-end accountability. These safeguards uphold data integrity and fairness in clinical applications using synthetic data, offering new standards for responsible and equitable use of synthetic data in health care.
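As a concrete, purely hypothetical rendering of the proposed deployment disclosure, a minimal provenance record might capture the synthetic fraction, the generator, and the validation checks performed; the schema and field names below are illustrative assumptions, not a published standard:

```python
# Illustrative sketch of a "deployment disclosure" for synthetic training data,
# in the spirit of this Viewpoint's safeguards. All field names and example
# values are hypothetical.
from dataclasses import dataclass, field

@dataclass
class SyntheticDataDisclosure:
    dataset_name: str
    synthetic_fraction: float            # share of training rows that are synthetic
    generator: str                       # eg, "tabular GAN", "diffusion", "rule-based simulator"
    source_population: str               # population the real seed data represent
    validation_checks: list[str] = field(default_factory=list)

disclosure = SyntheticDataDisclosure(
    dataset_name="sepsis-risk-train-v2",          # hypothetical dataset
    synthetic_fraction=0.40,
    generator="tabular GAN",
    source_population="adult ICU admissions, single academic centre",
    validation_checks=["marginal-distribution fidelity", "subgroup calibration audit"],
)
```

Publishing such a record alongside a deployed model is one way to make the end-to-end accountability the authors call for auditable rather than aspirational.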
{"title":"Synthetic data, synthetic trust: navigating data challenges in the digital revolution","authors":"Arman Koul MS , Deborah Duran PhD , Tina Hernandez-Boussard PhD","doi":"10.1016/j.landig.2025.100924","DOIUrl":"10.1016/j.landig.2025.100924","url":null,"abstract":"<div><div>In the evolving landscape of artificial intelligence (AI), the assumption that more data lead to better models has driven unchecked reliance on synthetic data to augment training datasets. Although synthetic data address crucial shortages of real-world training data, their overuse might propagate biases, accelerate model degradation, and compromise generalisability across populations. A concerning consequence of the rapid adoption of synthetic data in medical AI is the emergence of synthetic trust—an unwarranted confidence in models trained on artificially generated datasets that fail to preserve clinical validity or demographic realities. In this Viewpoint, we advocate for caution in using synthetic data to train clinical algorithms. We propose actionable safeguards for synthetic medical AI, including standards for training data, fragility testing during development, and deployment disclosures for synthetic origins to ensure end-to-end accountability. These safeguards uphold data integrity and fairness in clinical applications using synthetic data, offering new standards for responsible and equitable use of synthetic data in health care.</div></div>","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100924"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145662423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-01 | DOI: 10.1016/j.landig.2025.100912
The impact of artificial intelligence-driven decision support on uncertain antimicrobial prescribing: a randomised, multimethod study
William J Bolton PhD, Richard Wilson MPharm, Prof Mark Gilchrist MSc, Prof Pantelis Georgiou PhD, Prof Alison Holmes MD, Timothy M Rawson PhD
Background
Challenges exist when translating artificial intelligence (AI)-driven clinical decision support systems (CDSSs) from research into health-care settings, particularly in infectious diseases, an area in which behaviour, culture, uncertainty, and the frequent absence of a ground truth add to the complexity of medical decision making. We aimed to evaluate clinicians' perceptions of an AI CDSS for intravenous-to-oral antibiotic switching and how the system influences their decision making.
Methods
This randomised, multimethod study enrolled health-care professionals in the UK who were regularly involved in antibiotic prescribing. Participants were recruited through personal networks and the general email list of the British Infection Association. The first part of the study involved a semistructured interview about participants' experience of antibiotic prescribing and their perception of AI. The second part used a custom web app to run a clinical vignette experiment: each of the 12 case vignettes presented a patient currently receiving intravenous antibiotics, and participants were asked to decide whether or not the patient was suitable for switching to oral antibiotics. Across two groups, participants were assigned to receive, for each vignette, either standard of care (SOC) information or SOC alongside our previously developed AI-driven CDSS and its explanations. We assessed differences in participant choices according to the intervention they were assigned, both for each vignette and overall; evaluated the aggregate effect of the CDSS across all switching decisions; and characterised the decision diversity across participants. In the third part of the study, participants completed the system usability scale (SUS) and technology acceptance model (TAM) questionnaires to enable their opinions of the AI CDSS to be assessed.
Findings
59 clinicians were directly contacted or responded to recruitment emails, 42 of whom, from 23 hospitals in the UK, completed the study between April 23, 2024, and Aug 16, 2024. The median age of participants was 39 years (IQR 37–47); 19 (45%) were female and 23 (55%) were male, 26 (62%) were consultants and 16 (38%) were training-grade doctors, and 14 (33%) specialised in infectious diseases. Interviews revealed mixed individualisation of prescribing and uneven use of technology, alongside enthusiasm for AI that was conditional on evidence and usability but constrained by behavioural inertia and infrastructure limitations. Case vignette completion times and many decisions were equivalent between the SOC and CDSS interventions, with clinicians able to identify and ignore incorrect advice. When a statistical difference was observed, the CDSS influenced participants towards not switching (χ² 7·73, p=0·0054; logistic regression odds ratio 0·13 [95% CI 0·03–0·50]; p=0·0031). AI explanations were used only 9% of the time when available.
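The aggregate tests reported in the Findings (a chi-squared test on the intervention-by-decision table and a logistic regression odds ratio for switching given CDSS exposure) can be reproduced in outline as follows; the counts are placeholders, not the study's data:

```python
# Sketch of the aggregate analyses named in the Findings: chi-squared on the 2x2
# table of intervention by switch decision, and a logistic regression odds ratio.
# Not the authors' analysis code; the counts below are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency
import statsmodels.api as sm

# Rows: SOC, SOC+CDSS; columns: switched, did not switch (hypothetical counts).
table = np.array([[120, 80],
                  [90, 110]])
chi2, p, _, _ = chi2_contingency(table)

# Logistic regression: outcome = switched (1/0), predictor = CDSS exposure (1/0).
cdss = np.repeat([0, 1], table.sum(axis=1))
switched = np.concatenate([np.repeat([1, 0], table[0]),
                           np.repeat([1, 0], table[1])])
fit = sm.Logit(switched, sm.add_constant(cdss)).fit(disp=0)
odds_ratio = np.exp(fit.params[1])            # <1 indicates a shift away from switching
ci_low, ci_high = np.exp(fit.conf_int()[1])   # 95% CI on the odds-ratio scale
```

An odds ratio below 1, as reported (0·13), corresponds to the CDSS group being less likely to recommend the intravenous-to-oral switch.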
{"title":"The impact of artificial intelligence-driven decision support on uncertain antimicrobial prescribing: a randomised, multimethod study","authors":"William J Bolton PhD , Richard Wilson MPharm , Prof Mark Gilchrist MSc , Prof Pantelis Georgiou PhD , Prof Alison Holmes MD , Timothy M Rawson PhD","doi":"10.1016/j.landig.2025.100912","DOIUrl":"10.1016/j.landig.2025.100912","url":null,"abstract":"<div><h3>Background</h3><div>Challenges exist when translating artificial intelligence (AI)-driven clinical decision support systems (CDSSs) from research into health-care settings, particularly in infectious diseases, an area in which behaviour, culture, uncertainty, and frequent absence of a ground truth enhance the complexity of medical decision making. We aimed to evaluate clinicians’ perceptions of an AI CDSS for intravenous-to-oral antibiotic switching and how the system influences their decision making.</div></div><div><h3>Methods</h3><div>This randomised, multimethod study enrolled health-care professionals in the UK who were regularly involved in antibiotic prescribing. Participants were recruited through personal networks and the general email list of the British Infection Association. The first part of the study involved a semistructured interview about participants’ experience of antibiotic prescribing and their perception of AI. The second part used a custom web app to run a clinical vignette experiment: each of the 12 case vignettes consisted of a patient currently receiving intravenous antibiotics, and participants were asked to decide whether or not the patient was suitable for switching to oral antibiotics. Participants were assigned to receive either standard of care (SOC) information, or SOC alongside our previously developed AI-driven CDSS and its explanations, for each vignette across two groups. We assessed differences in participant choices according to the intervention they were assigned, both for each vignette and overall; evaluated the aggregate effect of the CDSS across all switching decisions; and characterised the decision diversity across participants. In the third part of the study, participants completed the system usability scale (SUS) and technology acceptance model (TAM) questionnaires to enable their opinions of the AI CDSS to be assessed.</div></div><div><h3>Findings</h3><div>59 clinicians were directly contacted or responded to recruitment emails, 42 of whom from 23 hospitals in the UK completed the study between April 23, 2024, and Aug 16, 2024. The median age of participants was 39 years (IQR 37–47), 19 (45%) were female and 23 (55%) were male, 26 (62%) were consultants and 16 (38%) were training-grade doctors, and 14 (33%) specialised in infectious diseases. Interviews revealed mixed individualisation of prescribing and uneven use of technology, alongside enthusiasm for AI, which was conditional on evidence and usability but constrained by behavioural inertia and infrastructure limitations. Case vignette completion times and many decisions were equivalent between SOC and CDSS interventions, with clinicians able to identify and ignore incorrect advice. When a statistical difference was observed, the CDSS influenced participants towards not switching (χ<sup>2</sup> 7·73, p=0·0054; logistic regression odds ratio 0·13 [95% CI 0·03–0·50]; p=0·0031). AI explanations were used only 9% of the time when available. 
","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 11","pages":"Article 100912"},"PeriodicalIF":24.1,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145726136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}