Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers.

IF 2.2 4区 医学 Q2 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Tomography Pub Date : 2023-10-18 DOI:10.3390/tomography9050151
Xiaofan Xiong, Brian J Smith, Stephen A Graves, Michael M Graham, John M Buatti, Reinhard R Beichel
{"title":"Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers.","authors":"Xiaofan Xiong, Brian J Smith, Stephen A Graves, Michael M Graham, John M Buatti, Reinhard R Beichel","doi":"10.3390/tomography9050151","DOIUrl":null,"url":null,"abstract":"<p><p>Convolutional neural networks (CNNs) have a proven track record in medical image segmentation. Recently, Vision Transformers were introduced and are gaining popularity for many computer vision applications, including object detection, classification, and segmentation. Machine learning algorithms such as CNNs or Transformers are subject to an inductive bias, which can have a significant impact on the performance of machine learning models. This is especially relevant for medical image segmentation applications where limited training data are available, and a model's inductive bias should help it to generalize well. In this work, we quantitatively assess the performance of two CNN-based networks (U-Net and U-Net-CBAM) and three popular Transformer-based segmentation network architectures (UNETR, TransBTS, and VT-UNet) in the context of HNC lesion segmentation in volumetric [F-18] fluorodeoxyglucose (FDG) PET scans. For performance assessment, 272 FDG PET-CT scans of a clinical trial (ACRIN 6685) were utilized, which includes a total of 650 lesions (primary: 272 and secondary: 378). The image data used are highly diverse and representative for clinical use. For performance analysis, several error metrics were utilized. The achieved Dice coefficient ranged from 0.833 to 0.809 with the best performance being achieved by CNN-based approaches. U-Net-CBAM, which utilizes spatial and channel attention, showed several advantages for smaller lesions compared to the standard U-Net. Furthermore, our results provide some insight regarding the image features relevant for this specific segmentation application. In addition, results highlight the need to utilize primary as well as secondary lesions to derive clinically relevant segmentation performance estimates avoiding biases.</p>","PeriodicalId":51330,"journal":{"name":"Tomography","volume":"9 5","pages":"1933-1948"},"PeriodicalIF":2.2000,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10611182/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tomography","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3390/tomography9050151","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

Abstract

Convolutional neural networks (CNNs) have a proven track record in medical image segmentation. Recently, Vision Transformers were introduced and are gaining popularity for many computer vision applications, including object detection, classification, and segmentation. Machine learning algorithms such as CNNs or Transformers are subject to an inductive bias, which can have a significant impact on the performance of machine learning models. This is especially relevant for medical image segmentation applications where limited training data are available, and a model's inductive bias should help it to generalize well. In this work, we quantitatively assess the performance of two CNN-based networks (U-Net and U-Net-CBAM) and three popular Transformer-based segmentation network architectures (UNETR, TransBTS, and VT-UNet) in the context of HNC lesion segmentation in volumetric [F-18] fluorodeoxyglucose (FDG) PET scans. For performance assessment, 272 FDG PET-CT scans of a clinical trial (ACRIN 6685) were utilized, which includes a total of 650 lesions (primary: 272 and secondary: 378). The image data used are highly diverse and representative for clinical use. For performance analysis, several error metrics were utilized. The achieved Dice coefficient ranged from 0.833 to 0.809 with the best performance being achieved by CNN-based approaches. U-Net-CBAM, which utilizes spatial and channel attention, showed several advantages for smaller lesions compared to the standard U-Net. Furthermore, our results provide some insight regarding the image features relevant for this specific segmentation application. In addition, results highlight the need to utilize primary as well as secondary lesions to derive clinically relevant segmentation performance estimates avoiding biases.

Abstract Image

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
FDG PET图像中癌症的头部和颈部分割:卷积神经网络和视觉变换器的性能比较。
卷积神经网络在医学图像分割方面有着良好的记录。最近,视觉转换器被引入,并在许多计算机视觉应用中越来越受欢迎,包括对象检测、分类和分割。机器学习算法(如CNNs或Transformers)会受到归纳偏差的影响,这会对机器学习模型的性能产生重大影响。这对于可用的训练数据有限的医学图像分割应用尤其重要,并且模型的归纳偏差应该有助于它很好地推广。在这项工作中,我们定量评估了两种基于CNN的网络(U-Net和U-Net-CBAM)和三种流行的基于Transformer的分割网络架构(UNETR、TransBTS和VT-UNet)在体积[F-18]氟脱氧葡萄糖(FDG)PET扫描中HNC病变分割的情况下的性能。为了进行性能评估,使用了临床试验(ACRIN 6685)的272次FDG PET-CT扫描,其中包括总共650个病变(原发性:272个,继发性:378个)。所使用的图像数据是高度多样化的,并且对于临床使用具有代表性。对于性能分析,使用了几个错误度量。实现的Dice系数在0.833到0.809之间,其中基于CNN的方法实现了最佳性能。U-Net-CBAM利用空间和通道注意力,与标准U-Net相比,在较小的病变方面显示出一些优势。此外,我们的结果提供了一些关于与该特定分割应用相关的图像特征的见解。此外,研究结果强调了利用原发性和继发性病变来推导临床相关分割性能估计的必要性,以避免偏差。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Tomography
Tomography Medicine-Radiology, Nuclear Medicine and Imaging
CiteScore
2.70
自引率
10.50%
发文量
222
期刊介绍: TomographyTM publishes basic (technical and pre-clinical) and clinical scientific articles which involve the advancement of imaging technologies. Tomography encompasses studies that use single or multiple imaging modalities including for example CT, US, PET, SPECT, MR and hyperpolarization technologies, as well as optical modalities (i.e. bioluminescence, photoacoustic, endomicroscopy, fiber optic imaging and optical computed tomography) in basic sciences, engineering, preclinical and clinical medicine. Tomography also welcomes studies involving exploration and refinement of contrast mechanisms and image-derived metrics within and across modalities toward the development of novel imaging probes for image-based feedback and intervention. The use of imaging in biology and medicine provides unparalleled opportunities to noninvasively interrogate tissues to obtain real-time dynamic and quantitative information required for diagnosis and response to interventions and to follow evolving pathological conditions. As multi-modal studies and the complexities of imaging technologies themselves are ever increasing to provide advanced information to scientists and clinicians. Tomography provides a unique publication venue allowing investigators the opportunity to more precisely communicate integrated findings related to the diverse and heterogeneous features associated with underlying anatomical, physiological, functional, metabolic and molecular genetic activities of normal and diseased tissue. Thus Tomography publishes peer-reviewed articles which involve the broad use of imaging of any tissue and disease type including both preclinical and clinical investigations. In addition, hardware/software along with chemical and molecular probe advances are welcome as they are deemed to significantly contribute towards the long-term goal of improving the overall impact of imaging on scientific and clinical discovery.
期刊最新文献
Left Ventricular Diastolic Dysfunction with Elevated Filling Pressures Is Associated with Embolic Stroke of Undetermined Source and Atrial Fibrillation. A Hybrid CNN-Transformer Model for Predicting N Staging and Survival in Non-Small Cell Lung Cancer Patients Based on CT-Scan. The Correlation between the Elastic Modulus of the Achilles Tendon Enthesis and Bone Microstructure in the Calcaneal Crescent. Feature Reviews for Tomography 2023. Optimal DaTQUANT Thresholds for Diagnostic Accuracy of Dementia with Lewy Bodies (DLB) and Parkinson's Disease (PD).
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1