A Multi-centric Evaluation of Deep Learning Models for Segmentation of COVID-19 Lung Lesions on Chest CT Scans

Iranian Journal of Radiology (IF 0.2; Zone 4, Medicine; JCR Q4, Radiology, Nuclear Medicine & Medical Imaging) | Pub Date: 2022-11-17 | DOI: 10.5812/iranjradiol-117992
S. Sotoudeh-Paima, Navid Hasanzadeh, Ali Bashirgonbadi, Amin Aref, M. Naghibi, Mostafa Zoorpaikar, Arvin Arian, M. Gity, H. Soltanian-Zadeh

Abstract

Background: Chest computed tomography (CT) is one of the most common tools for diagnosing patients with coronavirus disease 2019 (COVID-19). Because manual segmentation of COVID-19 lung lesions by radiologists is time-consuming, automated segmentation with deep learning is a promising step toward managing this infection and similar diseases in the future.

Objectives: This study aimed to evaluate the performance and generalizability of deep learning-based models for the automated segmentation of COVID-19 lung lesions.

Patients and Methods: Four datasets (2 private and 2 public) were used in this study. The first private dataset included 297 subjects (147 healthy and 150 COVID-19 cases), and the second included 82 COVID-19 subjects. The public datasets were COVID19-P20 (20 COVID-19 cases from 2 centers) and MosMedData (50 COVID-19 patients from a single center). Models were compared using the Dice similarity coefficient (DSC), the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The CT severity scores predicted by the model were compared with those of radiologists using Pearson's correlation coefficient (PCC). DSC was also used to compare the agreement between the model and an expert with the agreement between 2 experts on an unseen dataset. Finally, the generalizability of the model was evaluated, and a simple calibration strategy was proposed.

Results: The VGG16-UNet model showed the best performance across both private datasets, with a DSC of 84.23% ± 1.73% on the first private dataset and 56.61% ± 1.48% on the second private dataset. Similar results were obtained on public datasets, with a DSC of 60.10% ± 2.34% on the COVID19-P20 dataset and 66.28% ± 2.80% on a combined dataset of COVID19-P20 and MosMedData. The PCCs between the CT severity scores predicted by the model and those of radiologists were 0.89 and 0.85 on the first private dataset and 0.77 and 0.74 on the second private dataset for the right and left lungs, respectively. Moreover, when the model trained on the first private dataset was tested on the second private dataset and compared against the radiologist, a performance gap of 5.74% in DSC was observed. A calibration strategy reduced this gap to 0.53%.

Conclusion: The results demonstrated the potential of the proposed model for localizing COVID-19 lesions on CT scans across multiple datasets; its accuracy was comparable to that of radiologists, and it could assist them in diagnostic and treatment procedures. The effect of model calibration on performance on an unseen dataset was also reported: calibration increased the DSC by more than 5%.
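
The two agreement metrics named in the abstract can be illustrated with a short sketch. The DSC of a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|), and the PCC measures linear agreement between two sets of scores. The code below is not the authors' implementation: the function names, the flattened-list mask representation, the treatment of two empty masks as perfect agreement, and the toy inputs are all assumptions made for illustration.

```python
from math import sqrt
from typing import Sequence


def dice_coefficient(pred: Sequence[int], ref: Sequence[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of voxels")
    pred_b = [bool(v) for v in pred]
    ref_b = [bool(v) for v in ref]
    overlap = sum(p and r for p, r in zip(pred_b, ref_b))
    total = sum(pred_b) + sum(ref_b)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy masks (made up): 3 lesion voxels each, 2 of them shared.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"DSC = {dice_coefficient(predicted, reference):.4f}")

    # Toy severity scores (made up), e.g. model vs. radiologist per patient.
    model_scores = [1, 2, 3, 4, 5]
    reader_scores = [1, 2, 3, 5, 4]
    print(f"PCC = {pearson_correlation(model_scores, reader_scores):.4f}")
```

In practice a 3D segmentation mask would be flattened before being passed to `dice_coefficient`, and per-scan DSC values would then be averaged to obtain summary figures of the kind reported above.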