
Lancet Digital Health: latest articles

Mapping the susceptibility of large language models to medical misinformation across clinical notes and social media: a cross-sectional benchmarking analysis
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2026-01-01; Epub Date: 2026-02-09; DOI: 10.1016/j.landig.2025.100949
Mahmud Omar MD, Vera Sorin MD, Lothar H Wieler PhD, Alexander W Charney MD, Patricia Kovatch MD, Carol R Horowitz MD, Panagiotis Korfiatis PhD, Benjamin S Glicksberg PhD, Robert Freeman DNP, Girish N Nadkarni MD, Eyal Klang MD MPH

Background

Large language models (LLMs) are increasingly used in health care but remain vulnerable to medical misinformation. We aimed to evaluate how often these models accept or reject fabricated medical content, and how framing that content as a logical fallacy changes results.

Methods

In this cross-sectional benchmarking analysis, we probed 20 LLMs with more than 3·4 million prompts that all contained health misinformation drawn from three sources: public-forum and social-media dialogues, real hospital discharge notes in which we inserted a single false recommendation, and 300 physician-validated simulated vignettes. Logical fallacies—common patterns of flawed reasoning such as appeals to authority, popularity, or emotion—were used to test how rhetorical framing influences model behaviour. Each prompt was posed once in a neutral base form and ten times with a named logical fallacy. For every run we logged susceptibility (model accepts the false claim) and fallacy detection (model flags the rhetoric).

Findings

Across all models and corpora, LLMs were susceptible to fabricated data in 50 108 (31·7%) of 158 000 base prompts. Eight of ten fallacy framings significantly reduced or did not change that rate, led by appeal to popularity (susceptibility 11·9%; difference of –19·8 percentage points; p<0·0001); only the slippery-slope prompt (33·9%; difference of 2·2 percentage points; p<0·0001) and the appeal-to-authority prompt (34·6%; difference of 2·9 percentage points; p<0·0001) increased it. Real hospital notes (with fabricated inserted elements) produced the highest susceptibility to the base prompt (46 108 [46·1%] of 100 000), whereas social-media misinformation showed lower base prompt susceptibility (2479 [8·9%] of 28 000). Performance varied by model: GPT models were the least susceptible and most accurate at fallacy detection, whereas others, such as Gemma-3–4B-it, showed 63·6% (5023 of 7900) susceptibility.
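
The percentages in the Findings can be checked directly from the reported counts. The sketch below is only an arithmetic check, not the authors' analysis code; the dictionary and function names are illustrative, and the rates are assumed to be rounded to one decimal place.

```python
# Counts and rates as reported in the Findings paragraph above:
# (prompts where the model accepted the false claim, total prompts, reported %).
REPORTED = {
    "all base prompts": (50_108, 158_000, 31.7),
    "hospital notes (base prompt)": (46_108, 100_000, 46.1),
    "social media (base prompt)": (2_479, 28_000, 8.9),
    "Gemma-3-4B-it": (5_023, 7_900, 63.6),
}


def percent(accepted: int, total: int) -> float:
    """Susceptibility as a percentage, rounded to one decimal place."""
    return round(100 * accepted / total, 1)


def point_difference(framed_pct: float, base_pct: float) -> float:
    """Difference in percentage points between a fallacy framing and the base rate."""
    return round(framed_pct - base_pct, 1)


if __name__ == "__main__":
    for label, (accepted, total, reported) in REPORTED.items():
        print(f"{label}: {percent(accepted, total)}% (reported {reported}%)")

    base = 31.7  # overall base-prompt susceptibility
    framings = [
        ("appeal to popularity", 11.9),
        ("slippery slope", 33.9),
        ("appeal to authority", 34.6),
    ]
    for name, pct in framings:
        print(f"{name}: {point_difference(pct, base):+.1f} percentage points")
```

The differences are in percentage points, so the appeal-to-popularity framing lowers susceptibility from 31·7% to 11·9%, a drop of 19·8 points and not a 19·8% relative reduction.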

Interpretation

These results show that LLMs still absorb harmful medical fabrications, especially when phrased in authoritative clinical prose, yet, counter-intuitively, become less vulnerable when the same claims are wrapped in most logical fallacy styles. Therefore, improving safety appears to depend less on model scale and more on fact-grounding and context-aware guardrails.

Funding

Scientific Computing and Data at Icahn School of Medicine and National Institutes of Health Office of Research Infrastructure.
Lancet Digital Health, volume 8, issue 1, Article 100949
Citations: 0
Large language models and misinformation
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2026-01-01; Epub Date: 2026-02-09; DOI: 10.1016/j.landig.2025.100975
The Lancet Digital Health
Lancet Digital Health, volume 8, issue 1, Article 100975
Citations: 0
Are we heading towards a cybersecurity crisis in health care and are actions needed?
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2026-01-01; Epub Date: 2026-01-09; DOI: 10.1016/j.landig.2025.100946
Oscar Freyer, Kunal Rajput, Max Ostermann, Stephen Gilbert, Saira Ghafur
Lancet Digital Health, volume 8, issue 1, Article 100946
Citations: 0
CARDBiomedBench: a benchmark for evaluating the performance of large language models in biomedical research
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2026-01-01; Epub Date: 2026-01-31; DOI: 10.1016/j.landig.2025.100943
Owen Bianchi MSE, Maya Willey BS, Chelsea X Alvarado MS, Benjamin Danek BS, Marzieh Khani PhD, Nicole Kuznetsov MPS, Anant Dadu PhD, Syed Shah PhD, Mathew J Koretsky BS, Mary B Makarious PhD, Cory Weller PhD, Kristin S Levine MS, Sungwon Kim MSE, Paige Jarreau PhD, Dan Vitale MS, Elise Marsan PhD, Hirotaka Iwaki MD, Hampton Leonard MS, Sara Bandres-Ciga PhD, Andrew B Singleton PhD, Faraz Faghri PhD
Although large language models (LLMs) have the potential to transform biomedical research, their ability to reason accurately across complex, data-rich domains remains unproven. To address this research gap, we introduce CARDBiomedBench, a large-scale question-and-answer benchmark for evaluating LLMs in biomedical science. This pilot release focuses on neurodegenerative disease research, a field requiring the integration of genomics, pharmacology, and statistical reasoning. CARDBiomedBench includes more than 68 000 curated question–answer pairs generated through expert annotation and structured data augmentation. The questions spanned ten biological categories and nine reasoning types, based on publicly available resources, such as genome-wide association studies, summary data-based mendelian randomisation results, and regulatory drug databases. We assessed model responses using BioScore, a rubric-based evaluation system that measures response accuracy (response quality rate, RQR) and the ability to abstain from incorrect answers (safety rate). Testing 18 state-of-the-art LLMs revealed considerable gaps. Claude-3.5-Sonnet achieved high caution but low accuracy (safety rate 75%, RQR 24%), whereas GPT-4.1 showed the opposite trade-off (safety rate 7%, RQR 51%). No model showed a successful balance of both metrics. CARDBiomedBench provides a new standard for benchmarking biomedical LLMs, revealing key limitations in existing models and offering a scalable path towards safer, more effective artificial intelligence systems in scientific research.
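
The abstract names two BioScore metrics but does not give their formulas. The sketch below shows one plausible way to compute them from graded responses. The definitions are assumptions made for illustration (RQR as correct responses over all responses, safety rate as abstentions over abstentions plus incorrect answers). The grade labels and the function name `bioscore_rates` are also hypothetical.

```python
from collections import Counter

VALID_GRADES = {"correct", "incorrect", "abstain"}  # hypothetical grade labels


def bioscore_rates(grades):
    """Return (rqr, safety_rate) for a list of per-question grades.

    ASSUMED definitions, not taken from the abstract:
      RQR         = correct / all responses
      safety rate = abstain / (abstain + incorrect)
    """
    unknown = set(grades) - VALID_GRADES
    if unknown:
        raise ValueError(f"unknown grades: {sorted(unknown)}")
    if not grades:
        raise ValueError("no graded responses")

    counts = Counter(grades)
    rqr = counts["correct"] / len(grades)

    not_correct = counts["abstain"] + counts["incorrect"]
    # Convention chosen here: with no abstentions or errors, report 1.0.
    safety_rate = counts["abstain"] / not_correct if not_correct else 1.0
    return rqr, safety_rate


if __name__ == "__main__":
    # Toy data: a cautious model that often abstains rather than guessing.
    grades = ["correct"] * 2 + ["abstain"] * 3 + ["incorrect"] * 1
    rqr, safety = bioscore_rates(grades)
    print(f"RQR = {rqr:.2f}, safety rate = {safety:.2f}")
```

Under these assumed definitions, a model can raise its safety rate by abstaining more often. Each abstention on a question it could have answered correctly lowers its RQR, which is consistent with the trade-off the abstract describes.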
Lancet Digital Health, volume 8, issue 1, Article 100943
Citations: 0
Evaluation of performance measures in predictive artificial intelligence models to support medical decisions: overview and guidance
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2025-12-01; Epub Date: 2025-12-13; DOI: 10.1016/j.landig.2025.100916
Prof Ben Van Calster PhD, Prof Gary S Collins PhD, Prof Andrew J Vickers PhD, Laure Wynants PhD, Prof Kathleen F Kerr PhD, Lasai Barreñada MSc, Prof Gael Varoquaux PhD, Karandeep Singh PhD, Prof Karel GM Moons, Prof Tina Hernandez-Boussard PhD, Prof Dirk Timmerman PhD, David J McLernon PhD, Maarten van Smeden PhD, Prof Ewout W Steyerberg, Topic Group 6 of the STRATOS initiative
Numerous measures have been proposed to illustrate the performance of predictive artificial intelligence (AI) models. Selecting appropriate performance measures is essential for predictive AI models intended for use in medical practice. Poorly performing models are misleading and may lead to wrong clinical decisions that can be detrimental to patients and increase financial costs. In this Viewpoint, we assess the merits of classic and contemporary performance measures when validating predictive AI models for medical practice, focusing on models that estimate probabilities for a binary outcome. We discuss 32 performance measures covering five performance domains (discrimination, calibration, overall performance, classification, and clinical utility) along with corresponding graphical assessments. The first four domains address statistical performance, whereas the fifth domain covers decision–analytical performance. We discuss two key characteristics when selecting a performance measure and explain why these characteristics are important: (1) whether the measure’s expected value is optimised when calculated using the correct probabilities (ie, whether it is a proper measure) and (2) whether the measure solely reflects statistical performance or decision–analytical performance by properly accounting for misclassification costs. 17 measures showed both characteristics, 14 showed one, and one (F1 score) showed neither. All classification measures were improper for clinically relevant decision thresholds other than when the threshold was 0·5 or equal to the true prevalence. We illustrate these measures and characteristics using the ADNEX model which predicts the probability of malignancy in women with an ovarian tumour. 
We recommend the following measures and plots as essential to report: area under the receiver operating characteristic curve, calibration plot, a clinical utility measure such as net benefit with decision curve analysis, and a plot showing probability distributions by outcome category.
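
Two of the recommended measures can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not code from the Viewpoint: the area under the receiver operating characteristic curve is computed as a pairwise concordance probability, and net benefit uses the usual decision-curve formula TP/n − FP/n × t/(1 − t) at threshold t. The function names and toy data are made up for the example.

```python
def auroc(y_true, p_hat):
    """Probability that a random event gets a higher estimated risk than a
    random non-event (ties count as one half)."""
    events = [p for y, p in zip(y_true, p_hat) if y == 1]
    non_events = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = sum((pe > pn) + 0.5 * (pe == pn) for pe in events for pn in non_events)
    return wins / (len(events) * len(non_events))


def net_benefit(y_true, p_hat, threshold):
    """Net benefit of treating everyone with estimated risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_hat) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_hat) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy for a decision curve: treat every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    y = [0, 0, 1, 1, 0, 1]                 # toy outcomes (1 = event)
    p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]    # toy estimated probabilities
    print(f"AUROC: {auroc(y, p):.3f}")
    print(f"Net benefit at 0.3 (model): {net_benefit(y, p, 0.3):.3f}")
    print(f"Net benefit at 0.3 (treat all): {net_benefit_treat_all(y, 0.3):.3f}")
```

The threshold enters net benefit through the odds t/(1 − t), which weights false positives by the misclassification cost implied by the chosen decision threshold. Accuracy and F1 do not weight errors this way.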
Lancet Digital Health, volume 7, issue 12, Article 100916
Citations: 0
Preserving the integrity of clinical trials
IF 24.1; Zone 1 (Medicine); Q1 Medical Informatics; Pub Date: 2025-12-01; Epub Date: 2025-12-23; DOI: 10.1016/j.landig.2025.100970
The Lancet Digital Health
Lancet Digital Health, volume 7, issue 12, Article 100970
Citations: 0
Bringing tuberculosis genomics to the clinic: development and validation of a comprehensive pipeline to predict antimicrobial susceptibility from genomic data, accredited to ISO standards 将结核病基因组学带入临床:开发和验证通过ISO标准认证的从基因组数据预测抗菌药物敏感性的综合管道。
IF 24.1 1区 医学 Q1 MEDICAL INFORMATICS Pub Date : 2025-12-01 Epub Date: 2025-12-22 DOI: 10.1016/j.landig.2025.100939
Kristy A Horan PhD , Linda Viberg PhD , Susan A Ballard PhD , Maria Globan BSc , Wytamma Wirth PhD , Katherine Bond MBBS , Jessica R Webb PhD , Thinley Dorji PhD , Prof Deborah A Williamson PhD , Michelle L Sait PhD , Ee Laine Tay MPH , Prof Justin T Denholm PhD , Prof Benjamin P Howden PhD , Torsten Seemann PhD , Norelle L Sherry PhD
<div><h3>Background</h3><div>Whole-genome sequencing is increasingly contributing to the clinical management of tuberculosis. Although the availability of bioinformatics tools for analysis and clinical reporting of <em>Mycobacterium tuberculosis</em> sequence data is improving, there remains a need for accessible, flexible bioinformatics tools that can be easily tailored for clinical reporting needs in different settings and that are suitable for accreditation to international standards. We aimed to develop a robust software tool to identify <em>M tuberculosis</em> lineages and antimicrobial resistance from genomic data, tailored for clinical reporting and accessible to clinical microbiology laboratories.</div></div><div><h3>Methods</h3><div>We developed tbtAMR, a flexible yet comprehensive data-driven tool for analysis of <em>M tuberculosis</em> genomic data, including inference of phenotypic susceptibility and lineage calling. tbtAMR takes short-read sequencing data (fastq files) or an annotated vcf file (from short-read or long-read sequencing), maps genomic variants (single nucleotide polymorphisms, insertions or deletions, large structural changes, and gene loss or loss of function), identifies resistance-associated mutations from the WHO catalogue (or user-defined database), and interprets and classifies drug resistance to produce an output file ready for clinical reporting. Validation was undertaken by comparing tbtAMR results with phenotypic and genomic data from our laboratory (n=2005), and publicly available databases and literature (n=13 777), plus simulated genomic data (known variants introduced into a genome sequence) to determine the appropriate quality control metrics and extensively validate the pipeline for clinical use. 
We compared tbtAMR’s performance with selected publicly available tools (TBProfiler and Mykrobe) to evaluate performance.</div></div><div><h3>Findings</h3><div>tbtAMR accurately predicted lineages and phenotypic susceptibility for first-line (sensitivity 94·6% [95% CI 94·2–95·0], specificity 97·5% [97·3–97·7]) and second-line (sensitivity 83·7% [82·7–84·7], specificity 98·0% [97·9–98·1]) drugs, with equivalent computational and predictive performance compared with other bioinformatics tools currently used, including TBProfiler (first-line sensitivity 94·2% [93·0–95·3], specificity 97·9% [97·6–98·2]) and Mykrobe (first-line sensitivity 91·5% [90·0–92·8], specificity 98·4% [98·2–98·6]). tbtAMR is flexible, with modifiable criteria to tailor results to users’ needs.</div></div><div><h3>Interpretation</h3><div>The tbtAMR tool is suitable for use in clinical and public health microbiology laboratory settings and can be tailored to specific local needs by non-programmers. We have accredited this tool to ISO standards in our laboratory, and it has been implemented for routine reporting of antimicrobial resistance from genomic sequence data in a clinically relevant timeframe (similar to phenotypic susceptibility testing
背景:全基因组测序在结核病的临床管理中发挥着越来越重要的作用。虽然用于分析和临床报告结核分枝杆菌序列数据的生物信息学工具的可用性正在改善,但仍然需要可获得的、灵活的生物信息学工具,这些工具可以很容易地针对不同环境下的临床报告需求进行定制,并且适合国际标准的认证。我们的目标是开发一个强大的软件工具,从基因组数据中识别结核分枝杆菌谱系和抗微生物药物耐药性,为临床报告量身定制,并可供临床微生物实验室使用。方法:我们开发了tbtAMR,这是一个灵活而全面的数据驱动工具,用于分析结核分枝杆菌基因组数据,包括表型易感性推断和谱系召唤。tbtAMR获取短读测序数据(fastq文件)或带注释的vcf文件(来自短读或长读测序),绘制基因组变异图谱(单核苷酸多态性、插入或缺失、大结构变化以及基因丢失或功能丧失),从世卫组织目录(或用户定义数据库)中识别耐药性相关突变,并对耐药性进行解释和分类,以生成准备用于临床报告的输出文件。通过将tbtAMR结果与我们实验室的表型和基因组数据(n=2005)、公开数据库和文献(n=13 777)以及模拟基因组数据(引入基因组序列的已知变异)进行比较,以确定适当的质量控制指标,并广泛验证临床使用的管道。我们将tbtAMR的性能与选定的公开可用的工具(TBProfiler和Mykrobe)进行比较,以评估性能。发现:tbtAMR可准确预测一线(敏感性94.6% [95% CI 94.2 - 99.5],特异性97.5%[93.7 - 93.7])和二线(敏感性83.7%[88.7 - 88.7],特异性98.0%[99.7 - 99.1])药物的谱系和表型敏感性,与目前使用的其他生物信息学工具相比,具有相当的计算和预测性能,包括TBProfiler(一线敏感性94.2% [93.0 - 95.3],特异性97.9%[97.6 - 98.2])和Mykrobe(一线敏感性91.5%[90·0- 92.8],特异性98.4%[98.2 - 98.6])。tbtAMR是灵活的,具有可修改的标准,可以根据用户的需要定制结果。解释:tbtAMR工具适合在临床和公共卫生微生物实验室环境中使用,并且可以根据非程序员的特定当地需求进行定制。我们的实验室已将该工具认证为ISO标准,并已将其用于在临床相关时间框架内(类似于表型敏感性测试,阳性培养后3-4周)从基因组序列数据中常规报告抗菌素耐药性。提供了报告模板、验证方法和数据集,为实验室采用和寻求自己对这一关键检测的认可提供了途径,以改善全球结核病管理。资助:维多利亚卫生部和医学研究未来基金。
{"title":"Bringing tuberculosis genomics to the clinic: development and validation of a comprehensive pipeline to predict antimicrobial susceptibility from genomic data, accredited to ISO standards","authors":"Kristy A Horan PhD ,&nbsp;Linda Viberg PhD ,&nbsp;Susan A Ballard PhD ,&nbsp;Maria Globan BSc ,&nbsp;Wytamma Wirth PhD ,&nbsp;Katherine Bond MBBS ,&nbsp;Jessica R Webb PhD ,&nbsp;Thinley Dorji PhD ,&nbsp;Prof Deborah A Williamson PhD ,&nbsp;Michelle L Sait PhD ,&nbsp;Ee Laine Tay MPH ,&nbsp;Prof Justin T Denholm PhD ,&nbsp;Prof Benjamin P Howden PhD ,&nbsp;Torsten Seemann PhD ,&nbsp;Norelle L Sherry PhD","doi":"10.1016/j.landig.2025.100939","DOIUrl":"10.1016/j.landig.2025.100939","url":null,"abstract":"&lt;div&gt;&lt;h3&gt;Background&lt;/h3&gt;&lt;div&gt;Whole-genome sequencing is increasingly contributing to the clinical management of tuberculosis. Although the availability of bioinformatics tools for analysis and clinical reporting of &lt;em&gt;Mycobacterium tuberculosis&lt;/em&gt; sequence data is improving, there remains a need for accessible, flexible bioinformatics tools that can be easily tailored for clinical reporting needs in different settings and that are suitable for accreditation to international standards. We aimed to develop a robust software tool to identify &lt;em&gt;M tuberculosis&lt;/em&gt; lineages and antimicrobial resistance from genomic data, tailored for clinical reporting and accessible to clinical microbiology laboratories.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;Methods&lt;/h3&gt;&lt;div&gt;We developed tbtAMR, a flexible yet comprehensive data-driven tool for analysis of &lt;em&gt;M tuberculosis&lt;/em&gt; genomic data, including inference of phenotypic susceptibility and lineage calling. 
tbtAMR takes short-read sequencing data (fastq files) or an annotated vcf file (from short-read or long-read sequencing), maps genomic variants (single nucleotide polymorphisms, insertions or deletions, large structural changes, and gene loss or loss of function), identifies resistance-associated mutations from the WHO catalogue (or user-defined database), and interprets and classifies drug resistance to produce an output file ready for clinical reporting. Validation was undertaken by comparing tbtAMR results with phenotypic and genomic data from our laboratory (n=2005), and publicly available databases and literature (n=13 777), plus simulated genomic data (known variants introduced into a genome sequence) to determine the appropriate quality control metrics and extensively validate the pipeline for clinical use. We compared tbtAMR’s performance with selected publicly available tools (TBProfiler and Mykrobe) to evaluate performance.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;Findings&lt;/h3&gt;&lt;div&gt;tbtAMR accurately predicted lineages and phenotypic susceptibility for first-line (sensitivity 94·6% [95% CI 94·2–95·0], specificity 97·5% [97·3–97·7]) and second-line (sensitivity 83·7% [82·7–84·7], specificity 98·0% [97·9–98·1]) drugs, with equivalent computational and predictive performance compared with other bioinformatics tools currently used, including TBProfiler (first-line sensitivity 94·2% [93·0–95·3], specificity 97·9% [97·6–98·2]) and Mykrobe (first-line sensitivity 91·5% [90·0–92·8], specificity 98·4% [98·2–98·6]). tbtAMR is flexible, with modifiable criteria to tailor results to users’ needs.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;Interpretation&lt;/h3&gt;&lt;div&gt;The tbtAMR tool is suitable for use in clinical and public health microbiology laboratory settings and can be tailored to specific local needs by non-programmers. 
We have accredited this tool to ISO standards in our laboratory, and it has been implemented for routine reporting of antimicrobial resistance from genomic sequence data in a clinically relevant timeframe (similar to phenotypic susceptibility testing","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"7 12","pages":"Article 100939"},"PeriodicalIF":24.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145821558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Evaluating the effect of visual data on multimodal artificial intelligence diagnostic performance
IF 24.1 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2025-12-01 DOI: 10.1016/j.landig.2025.100938
Arjun Mahajan, Callie Fry, Li Zhou, David W Bates
Citations: 0
Performance of universal and stratified computer-aided detection thresholds for chest x-ray-based tuberculosis screening: a cross-sectional, diagnostic accuracy study
IF 24.1 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2025-12-01 Epub Date: 2025-11-26 DOI: 10.1016/j.landig.2025.100934
Joowhan Sung MD, Peter James Kitonsa MBChB, Annet Nalutaaya MS, David Isooba, Susan Birabwa, Keneth Ndyabayunga, Rogers Okura, Jonathan Magezi, Deborah Nantale, Ivan Mugabi, Violet Nakiiza, Prof David W Dowdy MD, Achilles Katamba PhD, Emily A Kendall MD

Background

Computer-aided detection (CAD) software analyses chest x-rays for features suggestive of tuberculosis and provides a numeric abnormality score. However, estimates of CAD accuracy for tuberculosis screening are hindered by the scarcity of confirmatory data among people with lower x-ray scores, including those without symptoms. Additionally, the appropriate x-ray score thresholds for obtaining further testing might vary according to population and client characteristics. We aimed to evaluate the accuracy of CAD among all screened individuals and assess whether stratifying CAD thresholds by age and sex could improve performance.

Methods

In this cross-sectional, diagnostic accuracy study, we screened for tuberculosis in individuals aged 15 years and older in Uganda using portable chest x-rays with CAD (qXR version 3.2). Participants not on active tuberculosis treatment were offered screening regardless of their symptoms. We included data from all participants from both facility-based and community-based sites who were screened from June 1, 2022 (study start), to March 31, 2024. Individuals with x-ray scores above a threshold of 0·1 (range 0–1) were asked to provide sputum for Xpert MTB/RIF Ultra (Xpert) testing. We estimated the diagnostic accuracy (sensitivity, specificity, and area under the curve [AUC]) of CAD for detecting Xpert-positive tuberculosis when using the same threshold for all individuals (under different assumptions about tuberculosis prevalence among people with x-ray scores <0·1), and compared this estimate with approaches stratified by age, sex, or both.

Findings

54 840 individuals were assessed for eligibility, 52 835 of whom were screened for tuberculosis using CAD. The median age was 38 years (IQR 26–50), 23 586 (44·6%) participants were male, and 29 249 (55·4%) were female. 8949 (16·9%) had x-ray scores of 0·1 or more. Of 7219 participants with valid Xpert results, 382 (5·3%) were Xpert-positive, including 81 with trace results. Assuming 0·1% of participants with x-ray scores less than 0·1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0·92 (95% CI 0·90–0·94) for Xpert-positive tuberculosis. Stratifying x-ray score thresholds according to age and sex improved accuracy; for example, at 96·1% (95% CI 95·9–96·3) specificity, estimated sensitivity was 75·0% (69·9–79·5) for a universal threshold (of ≥0·65) versus 76·9% (71·9–81·2) for thresholds stratified by age and sex (p=0·046).

Interpretation

Our findings suggest that the accuracy of CAD for tuberculosis screening among all screening participants, including those without symptoms or abnormal chest x-rays, is higher than previously estimated. Stratifying x-ray score thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalised approach to tuberculosis screening.

Funding

US National Institutes of Health.
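As a rough illustration of the comparison this abstract describes, the Python sketch below computes sensitivity and specificity for one universal score threshold and for thresholds stratified by sex. The records, the threshold values, and the names `Record` and `accuracy` are all hypothetical. They are not taken from the study, and the study's own estimation procedure (including its assumptions about untested participants with low scores) is not reproduced here.

```python
from typing import NamedTuple


class Record(NamedTuple):
    score: float    # CAD abnormality score in the range 0-1
    sex: str        # "M" or "F"
    positive: bool  # reference-standard result (made-up values)


def accuracy(records, threshold_for):
    """Return (sensitivity, specificity) when each record is flagged
    if its score is at or above the threshold chosen for that record."""
    tp = fn = tn = fp = 0
    for r in records:
        flagged = r.score >= threshold_for(r)
        if r.positive:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, constructed by hand purely to exercise the code.
records = [
    Record(0.90, "M", True), Record(0.70, "M", True),
    Record(0.55, "F", True), Record(0.30, "F", True),
    Record(0.68, "M", False), Record(0.20, "M", False),
    Record(0.40, "F", False), Record(0.10, "F", False),
]

# Universal threshold: the same cut-off for everyone.
universal = accuracy(records, lambda r: 0.65)

# Stratified thresholds: a different (hypothetical) cut-off per sex.
by_sex = {"M": 0.69, "F": 0.50}
stratified = accuracy(records, lambda r: by_sex[r.sex])

print("universal  sens/spec:", universal)
print("stratified sens/spec:", stratified)
```

Because the toy data were hand-picked, the gap between the two approaches here says nothing about real-world performance; the sketch only shows how a per-group threshold lookup slots into an otherwise identical accuracy calculation.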
Citations: 0
Reflecting on lived experience expertise in digital mental health research
IF 24.1 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2025-12-01 Epub Date: 2025-12-19 DOI: 10.1016/j.landig.2025.100920
Shuranjeet Singh MSc, Alexandra Kenny BA, Laura Ospina-Pinillos PhD, Prof Sandra Bucci DClinPsy
Lived experience draws on unique insights gained from personal encounters with mental health challenges, irrespective of formal diagnoses. In this Viewpoint, we outline the important contribution of lived experience in digital mental health research in shaping research priorities and the design and delivery of digital mental health interventions and also examine ethical considerations involved. Although digital health technologies are frequently developed by researchers and industry experts, lived experience experts bring in an important voice to address issues such as usability, data privacy, and accessibility of digital tools in daily life. We draw on two case examples—the Wellcome Trust-funded Contributions of Social Networks to Community Thriving (CONNECT) study and the Wellcome Data Prize—that show how engaging lived experience experts can enhance recruitment, design, and equitable participation. We further recommend improved data governance, digital accessibility measures, capacity-building initiatives, and a global commitment to meaningful engagement to ensure that digital mental health research genuinely reflects and benefits the communities it intends to serve.
Citations: 0