Lancet Digital Health: latest articles

A deep learning-based model to estimate pulmonary function from chest x-rays: multi-institutional model development and validation study in Japan
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00113-4

Background

Chest x-ray is a basic, cost-effective, and widely available imaging method that is used for static assessments of organic diseases and anatomical abnormalities, but its ability to estimate dynamic measurements such as pulmonary function is unknown. We aimed to estimate two major pulmonary functions from chest x-rays.

Methods

In this retrospective model development and validation study, we trained, validated, and externally tested a deep learning-based artificial intelligence (AI) model to estimate forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) from chest x-rays. We included consecutively collected results of spirometry and any associated chest x-rays that had been obtained between July 1, 2003, and Dec 31, 2021, from five institutions in Japan (labelled institutions A–E). Eligible x-rays had been acquired within 14 days of spirometry and were labelled with the FVC and FEV1. X-rays from three institutions (A–C) were used for training, validation, and internal testing, with the testing dataset being independent of the training and validation datasets, and then x-rays from the two other institutions (D and E) were used for independent external testing. Performance for estimating FVC and FEV1 was evaluated by calculating the Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) compared with the results of spirometry.

Findings

We included 141 734 x-ray and spirometry pairs from 81 902 patients from the five institutions. The training, validation, and internal test datasets included 134 307 x-rays from 75 768 patients (37 718 [50%] female, 38 050 [50%] male; mean age 56 years [SD 18]), and the external test datasets included 2137 x-rays from 1861 patients (742 [40%] female, 1119 [60%] male; mean age 65 years [SD 17]) from institution D and 5290 x-rays from 4273 patients (1972 [46%] female, 2301 [54%] male; mean age 63 years [SD 17]) from institution E. External testing for FVC yielded r values of 0·91 (99% CI 0·90–0·92) for institution D and 0·90 (0·89–0·91) for institution E, ICC of 0·91 (99% CI 0·90–0·92) and 0·89 (0·88–0·90), MSE of 0·17 L² (99% CI 0·15–0·19) and 0·17 L² (0·16–0·19), RMSE of 0·41 L (99% CI 0·39–0·43) and 0·41 L (0·39–0·43), and MAE of 0·31 L (99% CI 0·29–0·32) and 0·31 L (0·30–0·32). External testing for FEV1 yielded r values of 0·91 (99% CI 0·90–0·92) for institution D and 0·91 (0·90–0·91) for institution E, ICC of 0·90 (99% CI 0·89–0·91) and 0·90 (0·90–0·91), MSE of 0·13 L² (99% CI 0·12–0·15) and 0·11 L² (0·10–0·12), RMSE of 0·37 L (99% CI 0·35–0·38) and 0·33 L (0·32–0·35), and MAE of 0·28 L (99% CI 0·27–0·29) and 0·25 L (0·25–0·26).

Interpretation

This deep learning model allowed estimation of FVC and FEV1 from chest x-rays, showing high agreement with spirometry. The model offers an alternative to spirometry for assessing pulmonary function, which is especially useful for patients who are unable to undergo spirometry, and might enhance the customisation of CT imaging protocols based on information obtained from chest x-rays, improving the diagnosis and management of lung diseases. Future studies should investigate the performance of this AI model in combination with clinical information to enable more appropriate and targeted use.

Funding

None.

Lancet Digital Health, volume 6, issue 8, pages e580–e588.
Citations: 0
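
The error and correlation metrics named in the Methods of the chest x-ray study above can be illustrated with a minimal sketch. It assumes paired values in litres; the function name `agreement_metrics` and the toy numbers are hypothetical and are not taken from the study. ICC is left out because the abstract does not say which ICC form was used.

```python
import math


def agreement_metrics(estimated, measured):
    """Return Pearson r, MSE, RMSE, and MAE for paired values (litres)."""
    if len(estimated) != len(measured) or len(estimated) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_m = sum(measured) / n
    # Pearson r: covariance divided by the product of the standard deviations.
    cov = sum((e - mean_e) * (m - mean_m) for e, m in zip(estimated, measured))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    r = cov / math.sqrt(var_e * var_m)
    # Error metrics are computed on the per-pair differences.
    errors = [e - m for e, m in zip(estimated, measured)]
    mse = sum(err ** 2 for err in errors) / n
    mae = sum(abs(err) for err in errors) / n
    return {"r": r, "mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical toy values, not study data.
measured_fvc = [2.0, 3.0, 4.0, 5.0]
estimated_fvc = [2.5, 2.5, 4.5, 4.5]
metrics = agreement_metrics(estimated_fvc, measured_fvc)
```

Because RMSE is the square root of MSE, the reported FVC figures are mutually consistent: the square root of 0·17 L² is roughly 0·41 L.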
Medical artificial intelligence for clinicians: the lost cognitive perspective
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00095-5
Lana Tikhomirov BPsych [Hons], Prof Carolyn Semmler PhD, Melissa McCradden PhD, Rachel Searston PhD, Marzyeh Ghassemi PhD, Lauren Oakden-Rayner MD PhD

The development and commercialisation of medical decision systems based on artificial intelligence (AI) far outpaces our understanding of their value for clinicians. Although applicable across many forms of medicine, we focus on characterising the diagnostic decisions of radiologists through the concept of ecologically bounded reasoning, review the differences between clinician decision making and medical AI model decision making, and reveal how these differences pose fundamental challenges for integrating AI into radiology. We argue that clinicians are contextually motivated, mentally resourceful decision makers, whereas AI models are contextually stripped, correlational decision makers, and discuss misconceptions about clinician–AI interaction stemming from this misalignment of capabilities. We outline how future research on clinician–AI interaction could better address the cognitive considerations of decision making and be used to enhance the safety and usability of AI models in high-risk medical decision-making contexts.

Lancet Digital Health, volume 6, issue 8, pages e589–e594.
Citations: 0
A prognostic and predictive computational pathology immune signature for ductal carcinoma in situ: retrospective results from a cohort within the UK/ANZ DCIS trial
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00116-X

Background

The density of tumour-infiltrating lymphocytes (TILs) could be prognostic in ductal carcinoma in situ (DCIS). However, manual TIL quantification is time-consuming and suffers from interobserver and intraobserver variability. In this study, we developed a TIL-based computational pathology biomarker and evaluated its association with the risk of recurrence and benefit of adjuvant treatment in a clinical trial cohort.

Methods

In this retrospective cohort study, a computational pathology pipeline was developed to generate a TIL-based biomarker (CPath TIL categories). Subsequently, the signature underwent a masked independent validation on H&E-stained whole-section images of 755 patients with DCIS from the UK/ANZ DCIS randomised controlled trial. Specifically, continuous biomarker CPath TIL score was calculated as the average TIL density in the DCIS microenvironment and dichotomised into binary biomarker CPath TIL categories (CPath TIL-high vs CPath TIL-low) using the median value as a cutoff. The primary outcome was ipsilateral breast event (IBE; either recurrence of DCIS [DCIS-IBE] or invasive progression [I-IBE]). The Cox proportional hazards model was used to estimate the hazard ratio (HR).

Findings

CPath TIL-score was evaluable in 718 (95%) of 755 patients (151 IBEs). Patients with CPath TIL-high DCIS had a greater risk of IBE than those with CPath TIL-low DCIS (HR 2·10 [95% CI 1·39–3·18]; p=0·0004). The risk of I-IBE was greater in patients with CPath TIL-high DCIS than those with CPath TIL-low DCIS (3·09 [1·56–6·14]; p=0·0013), and the risk of DCIS-IBE was non-significantly higher in those with CPath TIL-high DCIS (1·61 [0·95–2·72]; p=0·077). A significant interaction (p for interaction=0·025) between CPath TIL categories and radiotherapy was observed with a greater magnitude of radiotherapy benefit in preventing IBE in CPath TIL-high DCIS (0·32 [0·19–0·54]) than CPath TIL-low DCIS (0·40 [0·20–0·81]).

Interpretation

High TIL density is associated with higher recurrence risk—particularly of invasive recurrence—and greater radiotherapy benefit in patients with DCIS. Our TIL-based computational pathology signature has a prognostic and predictive role in DCIS.

Funding

National Cancer Institute under award number U01CA269181, Cancer Research UK (C569/A12061; C569/A16891), and the Breast Cancer Research Foundation, New York (NY, USA).
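
The dichotomisation step described in the Methods above, where a continuous CPath TIL score is split into high and low categories at the median, can be sketched as follows. This is only an illustration: the function name `dichotomise_at_median` and the toy scores are hypothetical, and placing scores exactly equal to the median in the low category is an assumption, since the abstract does not say how ties were handled.

```python
from statistics import median


def dichotomise_at_median(scores):
    """Split continuous scores into two categories using the median as cutoff."""
    cutoff = median(scores)
    # Assumption: strictly above the median is "high"; ties fall into "low".
    labels = [
        "CPath TIL-high" if score > cutoff else "CPath TIL-low"
        for score in scores
    ]
    return cutoff, labels


# Hypothetical TIL densities, not trial data.
til_scores = [12.0, 48.5, 7.3, 95.1, 30.2, 61.0]
cutoff, labels = dichotomise_at_median(til_scores)
```

A median cutoff gives two groups of roughly equal size by construction, which is convenient for a hazard ratio comparison but ties the threshold to the cohort it was derived from.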

Lancet Digital Health, volume 6, issue 8, pages e562–e569.
Citations: 0
Correction to Lancet Digit Health 2024; 6: e562–69
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00156-0
Lancet Digital Health, volume 6, issue 8, page e545.
Citations: 0
Evaluation and communication of pandemic scenarios
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00144-4
Philip Gerlee, Henrik Thorén, Anna Saxne Jöud, Torbjörn Lundh, Armin Spreco, Anders Nordlund, Thomas Brezicka, Tom Britton, Magnus Kjellberg, Henrik Källberg, Anders Tegnell, Lisa Brouwers, Toomas Timpka
Lancet Digital Health, volume 6, issue 8, pages e543–e544.
Citations: 0
Feasibility of wearable sensor signals and self-reported symptoms to prompt at-home testing for acute respiratory viruses in the USA (DETECT-AHEAD): a decentralised, randomised controlled trial
IF 23.8 Zone 1 Medicine Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00096-7
Giorgio Quer PhD, Erin Coughlin BSN, Jorge Villacian MD, Felipe Delgado BS, Katherine Harris MPH, John Verrant MS, Matteo Gadaleta PhD, Ting-Yang Hung BS, Janna Ter Meer PhD, Jennifer M Radin PhD, Edward Ramos PhD, Monique Adams PhD, Lomi Kim DVM, Jason W Chien MD, Katie Baca-Motes MBA, Jay A Pandit MD, Dmitri Talantov MD, Prof Steven R Steinhubl MD
<div><h3>Background</h3><p>Early identification of an acute respiratory infection is important for reducing transmission and enabling earlier therapeutic intervention. We aimed to prospectively evaluate the feasibility of home-based diagnostic self-testing of viral pathogens in individuals prompted to do so on the basis of self-reported symptoms or individual changes in physiological parameters detected via a wearable sensor.</p></div><div><h3>Methods</h3><p>DETECT-AHEAD was a prospective, decentralised, randomised controlled trial carried out in a subpopulation of an existing cohort (DETECT) of individuals enrolled in a digital-only observational study in the USA. Participants aged 18 years or older were randomly assigned (1:1:1) with a block randomisation scheme stratified by under-represented in biomedical research status. All participants were offered a wearable sensor (Fitbit Sense smartwatch). Participants in groups 1 and 2 received an at-home self-test kit (Alveo be.well) for two acute respiratory viral pathogens: SARS-CoV-2 and respiratory syncytial virus. Participants in group 1 could be alerted through the DETECT study app to take the at-home test on the basis of changes in their physiological data (as detected by our algorithm) or due to self-reported symptoms; those in group 2 were prompted via the app to self-test only due to symptoms. Group 3 served as the control group, without alerts or home testing capability. The primary endpoints, assessed on an intention-to-treat basis, were the number of acute respiratory infections presented (self-reported) and diagnosed (electronic health record), and the number of participants using at-home testing in groups 1 and 2. 
This trial is registered with <span><span>ClinicalTrials.gov</span><svg><path></path></svg></span>, <span><span>NCT04336020</span><svg><path></path></svg></span>.</p></div><div><h3>Findings</h3><p>Between Sept 28 and Dec 30, 2021, 450 participants were recruited and randomly assigned to group 1 (n=149), group 2 (n=151), or group 3 (n=150). 179 (40%) participants were male, 264 (59%) were female, and seven (2%) identified as other. 232 (52%) were from populations historically under-represented in biomedical research. 118 (39%) of the 300 participants in groups 1 and 2 were prompted to self-test, with 61 (52%) successfully completing self-testing. Participants were prompted to home-test more frequently due to symptoms (41 [28%] in group 1 and 51 [34%] in group 2) than due to detected physiological changes (26 [17%] in group 1). Significantly more participants in group 1 received alerts to test than did those in group 2 (67 [45%] <em>vs</em> 51 [34%]; p=0·047). Of the 61 individuals who were prompted to test and successfully did so, 19 (31%) tested positive for a viral pathogen—all for SARS-CoV-2. The individuals diagnosed as positive for SARS-CoV-2 in the electronic health record were eight (5%) in group 1, four (3%) in group 2, and two (1%) in group 3, but it was difficult to c
背景:早期识别急性呼吸道感染对于减少传播和早期治疗干预非常重要。我们的目的是前瞻性地评估根据自我报告的症状或通过可穿戴传感器检测到的个人生理参数变化,对个人进行基于家庭的病毒病原体诊断性自我检测的可行性:DETECT-AHEAD是一项前瞻性、分散的随机对照试验,在美国参加纯数字观察研究的现有人群(DETECT)的一个子人群中进行。年龄在 18 岁或以上的参与者按照生物医学研究中代表性不足的状况进行分层随机分配(1:1:1)。所有参与者都获得了一个可穿戴传感器(Fitbit Sense 智能手表)。第 1 组和第 2 组的参与者接受了两种急性呼吸道病毒病原体的居家自我检测试剂盒(Alveo be.well):SARS-CoV-2 和呼吸道合胞病毒。第 1 组的参与者可根据生理数据的变化(由我们的算法检测到)或自我报告的症状,通过 DETECT 研究应用程序提醒他们进行居家检测;第 2 组的参与者仅在出现症状时才通过应用程序提示他们进行自我检测。第 3 组为对照组,没有提示或家庭测试功能。在意向治疗基础上评估的主要终点是急性呼吸道感染(自报)和诊断(电子健康记录)的数量,以及在第1组和第2组中使用家庭检测的参与者数量。该试验已在 ClinicalTrials.gov 注册,编号为 NCT04336020:2021年9月28日至12月30日期间,共招募了450名参与者,并随机分配到第1组(人数=149)、第2组(人数=151)或第3组(人数=150)。179名参与者(40%)为男性,264名(59%)为女性,7名(2%)为其他身份。232人(52%)来自历史上在生物医学研究中代表性不足的人群。在第一组和第二组的 300 名参与者中,有 118 人(39%)在提示下进行了自我检测,其中 61 人(52%)成功完成了自我检测。因症状(第一组 41 人 [28%],第二组 51 人 [34%])而提示参与者进行家庭检测的频率高于因检测到的生理变化(第一组 26 人 [17%])。收到测试提示的第一组参与者明显多于第二组(67 [45%] vs 51 [34%];P=0-047)。在收到检测提示并成功进行检测的 61 人中,有 19 人(31%)的病毒病原体检测呈阳性,全部为 SARS-CoV-2。在电子健康记录中被诊断为 SARS-CoV-2 阳性的患者中,第一组有 8 人(5%),第二组有 4 人(3%),第三组有 2 人(1%),但很难确认他们是否与试验中记录的症状发作有关。没有不良事件发生:在这项直接面向参与者的试验中,我们展示了一项分散计划的早期可行性,该计划可根据研究应用程序中跟踪的症状或使用可穿戴传感器检测到的生理变化,提示个人使用病毒病原体诊断测试。此外,还发现了阻碍充分参与和表现的因素,这需要在大规模实施前加以解决:资金来源:杨森制药公司
{"title":"Feasibility of wearable sensor signals and self-reported symptoms to prompt at-home testing for acute respiratory viruses in the USA (DETECT-AHEAD): a decentralised, randomised controlled trial","authors":"Giorgio Quer PhD ,&nbsp;Erin Coughlin BSN ,&nbsp;Jorge Villacian MD ,&nbsp;Felipe Delgado BS ,&nbsp;Katherine Harris MPH ,&nbsp;John Verrant MS ,&nbsp;Matteo Gadaleta PhD ,&nbsp;Ting-Yang Hung BS ,&nbsp;Janna Ter Meer PhD ,&nbsp;Jennifer M Radin PhD ,&nbsp;Edward Ramos PhD ,&nbsp;Monique Adams PhD ,&nbsp;Lomi Kim DVM ,&nbsp;Jason W Chien MD ,&nbsp;Katie Baca-Motes MBA ,&nbsp;Jay A Pandit MD ,&nbsp;Dmitri Talantov MD ,&nbsp;Prof Steven R Steinhubl MD","doi":"10.1016/S2589-7500(24)00096-7","DOIUrl":"10.1016/S2589-7500(24)00096-7","url":null,"abstract":"&lt;div&gt;&lt;h3&gt;Background&lt;/h3&gt;&lt;p&gt;Early identification of an acute respiratory infection is important for reducing transmission and enabling earlier therapeutic intervention. We aimed to prospectively evaluate the feasibility of home-based diagnostic self-testing of viral pathogens in individuals prompted to do so on the basis of self-reported symptoms or individual changes in physiological parameters detected via a wearable sensor.&lt;/p&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;Methods&lt;/h3&gt;&lt;p&gt;DETECT-AHEAD was a prospective, decentralised, randomised controlled trial carried out in a subpopulation of an existing cohort (DETECT) of individuals enrolled in a digital-only observational study in the USA. Participants aged 18 years or older were randomly assigned (1:1:1) with a block randomisation scheme stratified by under-represented in biomedical research status. All participants were offered a wearable sensor (Fitbit Sense smartwatch). Participants in groups 1 and 2 received an at-home self-test kit (Alveo be.well) for two acute respiratory viral pathogens: SARS-CoV-2 and respiratory syncytial virus. 
Participants in group 1 could be alerted through the DETECT study app to take the at-home test on the basis of changes in their physiological data (as detected by our algorithm) or due to self-reported symptoms; those in group 2 were prompted via the app to self-test only due to symptoms. Group 3 served as the control group, without alerts or home testing capability. The primary endpoints, assessed on an intention-to-treat basis, were the number of acute respiratory infections presented (self-reported) and diagnosed (electronic health record), and the number of participants using at-home testing in groups 1 and 2. This trial is registered with &lt;span&gt;&lt;span&gt;ClinicalTrials.gov&lt;/span&gt;&lt;svg&gt;&lt;path&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;, &lt;span&gt;&lt;span&gt;NCT04336020&lt;/span&gt;&lt;svg&gt;&lt;path&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;.&lt;/p&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;Findings&lt;/h3&gt;&lt;p&gt;Between Sept 28 and Dec 30, 2021, 450 participants were recruited and randomly assigned to group 1 (n=149), group 2 (n=151), or group 3 (n=150). 179 (40%) participants were male, 264 (59%) were female, and seven (2%) identified as other. 232 (52%) were from populations historically under-represented in biomedical research. 118 (39%) of the 300 participants in groups 1 and 2 were prompted to self-test, with 61 (52%) successfully completing self-testing. Participants were prompted to home-test more frequently due to symptoms (41 [28%] in group 1 and 51 [34%] in group 2) than due to detected physiological changes (26 [17%] in group 1). Significantly more participants in group 1 received alerts to test than did those in group 2 (67 [45%] &lt;em&gt;vs&lt;/em&gt; 51 [34%]; p=0·047). Of the 61 individuals who were prompted to test and successfully did so, 19 (31%) tested positive for a viral pathogen—all for SARS-CoV-2. 
The individuals diagnosed as positive for SARS-CoV-2 in the electronic health record were eight (5%) in group 1, four (3%) in group 2, and two (1%) in group 3, but it was difficult to c","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"6 8","pages":"Pages e546-e554"},"PeriodicalIF":23.8,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11296689/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141767678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pathology in the era of generative AI
IF 23.8 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00157-2
The Lancet Digital Health
Lancet Digital Health, volume 6, issue 8, page e536. PDF: https://www.sciencedirect.com/science/article/pii/S2589750024001572/pdfft?md5=b34ff67d1c10eff92c7810c03c24a9b8&pid=1-s2.0-S2589750024001572-main.pdf
Citations: 0
ChatGPT for digital pathology research
IF 23.8 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00114-6

The rapid evolution of generative artificial intelligence (AI) models including OpenAI's ChatGPT signals a promising era for medical research. In this Viewpoint, we explore the integration and challenges of large language models (LLMs) in digital pathology, a rapidly evolving domain demanding intricate contextual understanding. The restricted domain-specific efficiency of LLMs necessitates the advent of tailored AI tools, as illustrated by advancements seen in the last few years including FrugalGPT and BioBERT. Our initiative in digital pathology emphasises the potential of domain-specific AI tools, where a curated literature database coupled with a user-interactive web application facilitates precise, referenced information retrieval. Motivated by the success of this initiative, we discuss how domain-specific approaches substantially minimise the risk of inaccurate responses, enhancing the reliability and accuracy of information extraction. We also highlight the broader implications of such tools, particularly in streamlining access to scientific research and democratising access to computational pathology techniques for scientists with little coding experience. This Viewpoint calls for an enhanced integration of domain-specific text-generation AI tools in academic settings to facilitate continuous learning and adaptation to the dynamically evolving landscape of medical research.

Lancet Digital Health, volume 6, issue 8, pages e595-e600. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11299190/pdf/
Citations: 0
Migration background, skin colour, gender, and infectious disease presentation in clinical vignettes
IF 23.8 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00112-2
Yael Lohse, Katharina Last, Dogus Darici, Sören L Becker, Cihan Papan
Lancet Digital Health, volume 6, issue 8, pages e539-e540. PDF: https://www.sciencedirect.com/science/article/pii/S2589750024001122/pdfft?md5=79136ea6973726c095e2ef5ef718b584&pid=1-s2.0-S2589750024001122-main.pdf
Citations: 0
The diagnostic and triage accuracy of the GPT-3 artificial intelligence model: an observational study
IF 23.8 Zone 1 (Medicine) Q1 MEDICAL INFORMATICS Pub Date: 2024-08-01 DOI: 10.1016/S2589-7500(24)00097-9
David M Levine MD, Rudraksh Tuwani BS, Benjamin Kompa MPhil, Amita Varma BS, Samuel G Finlayson MD PhD, Prof Ateev Mehrotra MD, Andrew Beam PhD
<div><h3>Background</h3><p>Artificial intelligence (AI) applications in health care have been effective in many areas of medicine, but they are often trained for a single task using labelled data, making deployment and generalisability challenging. How well a general-purpose AI language model performs diagnosis and triage relative to physicians and laypeople is not well understood.</p></div>
<div><h3>Methods</h3><p>We compared the accuracy of Generative Pre-trained Transformer 3 (GPT-3) in diagnosing and triaging 48 validated synthetic case vignettes (&lt;50 words; sixth-grade reading level or below) of both common (eg, viral illness) and severe (eg, heart attack) conditions with that of a nationally representative sample of 5000 lay people from the USA, who could use the internet to find the correct options, and 21 practising physicians at Harvard Medical School. There were 12 vignettes for each of four triage categories: emergent, within 1 day, within 1 week, and self-care. The correct diagnosis and triage category (ie, ground truth) for each vignette was determined by two general internists at Harvard Medical School. For each vignette, human respondents and GPT-3 were prompted to list diagnoses in order of likelihood, and the vignette was marked as correct if the ground-truth diagnosis was in the top three of the listed diagnoses. For triage accuracy, we examined whether the human respondents' and GPT-3's selected triage was exactly correct according to the four triage categories, or matched a dichotomised triage variable (emergent or within 1 day <em>vs</em> within 1 week or self-care). We estimated GPT-3's diagnostic and triage confidence on a given vignette using a modified bootstrap resampling procedure, and examined how well calibrated GPT-3's confidence was by computing calibration curves and Brier scores. We also performed subgroup analysis by case acuity, and an error analysis for triage advice to characterise how its advice might affect patients using this tool to decide if they should seek medical care immediately.</p></div>
<div><h3>Findings</h3><p>Among all cases, GPT-3 replied with the correct diagnosis in its top three for 88% (42/48, 95% CI 75–94) of cases, compared with 54% (2700/5000, 53–55) for lay individuals (p&lt;0·0001) and 96% (637/666, 94–97) for physicians (p=0·012). GPT-3 triaged 70% of cases correctly (34/48, 57–82) versus 74% (3706/5000, 73–75; p=0·60) for lay individuals and 91% (608/666, 89–93; p&lt;0·0001) for physicians. As measured by the Brier score, GPT-3 confidence in its top prediction was reasonably well calibrated for diagnosis (Brier score=0·18) and triage (Brier score=0·22). We observed an inverse relationship between case acuity and GPT-3 accuracy (p&lt;0·0001), with a fitted trend line showing an 8·33% decrease in accuracy for every level of increase in case acuity. For triage error analysis, GPT-3 deprioritised truly emergent cases in seven instances.</p></div>
<div><h3>Interpretation</h3><p>A general-purpose AI language model without any content-specific training could perform diagnosis at levels close to, but below, physicians and better than lay individuals. We found that GPT-3's performance was inferior to physicians for triage, sometimes by a large margin, and its performance was closer to that of lay individuals. Although the diagnostic performance of GPT-3 was comparable to physicians, it was significantly better than a typical person using a search engine.</p></div>
<div><h3>Funding</h3><p>The National Heart, Lung, and Blood Institute.</p></div>
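The abstract above scores a vignette as correct when the ground-truth diagnosis appears in the top three listed diagnoses, reports each accuracy with a 95% CI, and summarises calibration with Brier scores. The following is a minimal Python sketch of those three calculations, not the authors' code. The function names are hypothetical, and the choice of a Wilson score interval is an assumption: the abstract does not say which interval method was used, nor how the modified bootstrap confidence estimates were produced.

```python
from math import sqrt


def top_three_correct(ranked_diagnoses, truth):
    """True if the ground-truth diagnosis is among the first three listed."""
    return truth in ranked_diagnoses[:3]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%).

    Assumption: the abstract does not name its CI method; Wilson is used
    here only as one common choice for small samples.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def brier_score(confidences, outcomes):
    """Mean squared gap between stated confidence (0..1) and outcome (0 or 1).

    0 is perfect; always answering 0.5 scores 0.25.
    """
    return sum((c - o) ** 2 for c, o in zip(confidences, outcomes)) / len(outcomes)


if __name__ == "__main__":
    # 42 of 48 vignettes had the correct diagnosis in GPT-3's top three.
    low, high = wilson_interval(42, 48)
    print(f"accuracy {42 / 48:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

Under this assumption, the interval for 42/48 rounds to 75% to 94%, and the interval for 34/48 rounds to 57% to 82%, which agrees with the figures quoted in the Findings. A Brier score of 0·18 or 0·22 therefore sits below the 0·25 that uninformative 50% confidence would earn.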
Lancet Digital Health, volume 6, issue 8, pages e555-e561. PDF: https://www.sciencedirect.com/science/article/pii/S2589750024000979/pdfft?md5=ea4e50c92b21c03fc0e3ebee146bfe6e&pid=1-s2.0-S2589750024000979-main.pdf
Citations: 0