A Multicenter, Scan-Rescan, Human and Machine Learning CMR Study to Test Generalizability and Precision in Imaging Biomarker Analysis.

Comparative Parasitology (IF 0.4; Zone 4, Medicine; Q4 Parasitology). Pub Date: 2019-10-01. Epub Date: 2019-09-24. DOI: 10.1161/CIRCIMAGING.119.009214

Anish Bhuva, Wenjia Bai, Clement Lau, Rhodri Davies, Yang Ye, Heeraj Bulluck, Elisa McAlindon, Veronica Culotta, Peter Swoboda, Gabriella Captur, Thomas Treibel, Joao Augusto, Kristopher Knott, Andreas Seraphim, Graham Cole, Steffen Petersen, Nicola Edwards, John Greenwood, Chiara Bucciarelli-Ducci, Alun Hughes, Daniel Rueckert, James Moon, Charlotte Manisty



Abstract

Background: Automated analysis of cardiac structure and function using machine learning (ML) has great potential, but is currently hindered by poor generalizability. Comparison is traditionally against clinicians as a reference, ignoring inherent human inter- and intraobserver error, and ensuring that ML cannot demonstrate superiority. Measuring precision (scan:rescan reproducibility) addresses this. We compared precision of ML and humans using a multicenter, multi-disease, scan:rescan cardiovascular magnetic resonance data set.

Methods: One hundred ten patients (5 disease categories, 5 institutions, 2 scanner manufacturers, and 2 field strengths) underwent scan:rescan cardiovascular magnetic resonance (96% within one week). After identification of the most precise human technique, left ventricular chamber volumes, mass, and ejection fraction were measured by an expert, a trained junior clinician, and a fully automated convolutional neural network trained on 599 independent multicenter disease cases. Scan:rescan coefficients of variation, with 95% CIs from 1000 bootstrap resamples, were calculated and compared using linear mixed-effects models.
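The scan:rescan precision statistic described above can be illustrated with a minimal sketch. It assumes a within-subject coefficient of variation computed as the root mean square of per-patient SD/mean, and a percentile bootstrap over patients; the abstract does not state the exact estimator, so both choices, the function names `scan_rescan_cv` and `bootstrap_ci`, and the synthetic ejection fraction values are illustrative assumptions, not the study's method or data.

```python
import numpy as np


def scan_rescan_cv(scan, rescan):
    """Within-subject coefficient of variation (%) for paired measurements.

    Assumed convention: root mean square of per-subject SD / mean.
    """
    scan = np.asarray(scan, dtype=float)
    rescan = np.asarray(rescan, dtype=float)
    pair_mean = (scan + rescan) / 2.0
    # For two measurements, the sample SD equals |difference| / sqrt(2).
    pair_sd = np.abs(scan - rescan) / np.sqrt(2.0)
    return 100.0 * np.sqrt(np.mean((pair_sd / pair_mean) ** 2))


def bootstrap_ci(scan, rescan, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the CV, resampling patients with replacement."""
    scan = np.asarray(scan, dtype=float)
    rescan = np.asarray(rescan, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(scan)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # keep scan and rescan paired
        stats[b] = scan_rescan_cv(scan[idx], rescan[idx])
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper


# Synthetic ejection fraction values (%), for illustration only.
ef_scan = [55.0, 62.0, 38.0, 47.0, 66.0, 59.0]
ef_rescan = [57.0, 60.0, 41.0, 45.0, 64.0, 61.0]
cv = scan_rescan_cv(ef_scan, ef_rescan)
ci_low, ci_high = bootstrap_ci(ef_scan, ef_rescan)
print(f"CV = {cv:.1f}% (95% CI {ci_low:.1f}%-{ci_high:.1f}%)")
```

Resampling whole patients, rather than individual scans, keeps each scan paired with its rescan, which is what a scan:rescan comparison requires.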

Results: Clinicians can be confident in detecting a 9% change in left ventricular ejection fraction, with more than half of the coefficient of variation attributable to intraobserver variation. Expert, trained junior, and automated scan:rescan precision were similar (for left ventricular ejection fraction, coefficient of variation 6.1% [5.2%-7.1%], P=0.2581; 8.3% [5.6%-10.3%], P=0.3653; 8.8% [6.1%-11.1%], P=0.8620). Automated analysis was 186× faster than humans (0.07 versus 13 minutes).
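As a quick check of the reported speed-up, the ratio of the two analysis times quoted above works out as follows (only the figures given in the abstract are used):

```latex
\[
\frac{13\ \text{min}}{0.07\ \text{min}} \approx 185.7 \approx 186,
\qquad
0.07\ \text{min} \times 60\ \tfrac{\text{s}}{\text{min}} = 4.2\ \text{s}.
\]
```

This is consistent with the stated 186× figure and puts the automated analysis at roughly 4 seconds per case.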

Conclusions: Automated ML analysis is faster with similar precision to the most precise human techniques, even when challenged with real-world scan:rescan data. Assessment of multicenter, multi-vendor, multi-field strength scan:rescan data (available at www.thevolumesresource.com) permits a generalizable assessment of ML precision and may facilitate direct translation of ML to clinical practice.

