Demystifying the effect of receptive field size in U-Net models for medical image segmentation.

Journal of Medical Imaging (IF 1.9, Q3, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING). Pub Date: 2024-09-01. Epub Date: 2024-10-29. DOI: 10.1117/1.JMI.11.5.054004

Vincent Loos, Rohit Pardasani, Navchetan Awasthi

Journal of Medical Imaging, volume 11, issue 5, article 054004. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11520766/pdf/

Citations: 0

Abstract



Demystifying the effect of receptive field size in U-Net models for medical image segmentation.

Purpose: Medical image segmentation is a critical task in healthcare applications, and U-Nets have demonstrated promising results in this domain. We delve into the understudied aspect of receptive field (RF) size and its impact on the U-Net and attention U-Net architectures used for medical imaging segmentation.

Approach: We explore several critical elements including the relationship among RF size, characteristics of the region of interest, and model performance, as well as the balance between RF size and computational costs for U-Net and attention U-Net methods for different datasets. We also propose a mathematical notation for representing the theoretical receptive field (TRF) of a given layer in a network and propose two new metrics, namely, the effective receptive field (ERF) rate and the object rate. The ERF rate quantifies the fraction of significantly contributing pixels within the ERF relative to the TRF area, and the object rate assesses the size of the segmentation object relative to the TRF size.
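To make the three quantities in the Approach concrete, here is a minimal Python sketch. It is not the authors' notation or code: the TRF is computed with the standard layer-by-layer recursion for convolution and pooling stacks, and the two metrics follow only the one-sentence descriptions in the abstract. The function names, the area-ratio form of the object rate, and the relative threshold used to decide which pixels are "significantly contributing" are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def theoretical_rf(layers: Sequence[Tuple[int, int, int]]) -> int:
    """Side length (in input pixels) of the TRF after a stack of layers.

    Each layer is (kernel, stride, dilation). Standard recursion:
    rf grows by (kernel - 1) * dilation * jump, where jump is the
    product of the strides of all earlier layers.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


def object_rate(object_area_px: float, trf_side: int) -> float:
    """Assumed form: segmentation object area divided by the TRF area."""
    return object_area_px / float(trf_side ** 2)


def erf_rate(grad_map: Sequence[Sequence[float]], trf_side: int,
             rel_threshold: float = 0.05) -> float:
    """Assumed form: number of pixels whose absolute contribution is at
    least rel_threshold times the peak, divided by the TRF area.

    grad_map is a 2D grid of per-pixel contributions (for example, input
    gradients of one output unit); the threshold value is hypothetical.
    """
    peak = max(abs(v) for row in grad_map for v in row)
    if peak == 0:
        return 0.0
    significant = sum(1 for row in grad_map for v in row
                      if abs(v) >= rel_threshold * peak)
    return significant / float(trf_side ** 2)


# Two 3x3 convs, a 2x2 max pool with stride 2, then two more 3x3 convs.
stack = [(3, 1, 1), (3, 1, 1), (2, 2, 1), (3, 1, 1), (3, 1, 1)]
trf = theoretical_rf(stack)
print("TRF side:", trf)
print("object rate:", object_rate(49, trf))
```

Under these assumptions an object rate well below 1 means the object fits inside the TRF, while a value above 1 means the object is larger than what a single unit can see.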

Results: The results demonstrate that there exists an optimal TRF size that successfully strikes a balance between capturing a wider global context and maintaining computational efficiency, thereby optimizing model performance. Interestingly, a distinct correlation is observed between the data complexity and the required TRF size; segmentation based solely on contrast achieved peak performance even with smaller TRF sizes, whereas more complex segmentation tasks necessitated larger TRFs. Attention U-Net models consistently outperformed their U-Net counterparts, highlighting the value of attention mechanisms regardless of TRF size.

Conclusions: These insights present an invaluable resource for developing more efficient U-Net-based architectures for medical imaging and pave the way for future exploration of other segmentation architectures. We also developed a tool that calculates the TRF for a U-Net (and attention U-Net) model and suggests an appropriate TRF size for a given model and dataset.
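The authors' tool is not reproduced here. As a rough illustration of what a TRF calculation for a U-Net involves, the sketch below computes the TRF at the bottleneck of a plain U-Net encoder. The function name and its parameters (`depth`, `convs_per_level`, `kernel`, `pool`) are hypothetical, and it assumes stride-1 convolutions without dilation followed by non-overlapping pooling at each level; it does not attempt the dataset-dependent TRF suggestion mentioned above.

```python
def unet_bottleneck_trf(depth: int, convs_per_level: int = 2,
                        kernel: int = 3, pool: int = 2) -> int:
    """TRF side length at the bottleneck of a plain U-Net encoder.

    depth is the number of pooling steps; the bottleneck is the extra
    level after the last pooling step. Assumes stride-1, undilated
    convolutions and pooling with window == stride == pool.
    """
    rf, jump = 1, 1
    for level in range(depth + 1):
        # Convolutions at this resolution each add (kernel - 1) * jump.
        for _ in range(convs_per_level):
            rf += (kernel - 1) * jump
        # Every level except the bottleneck ends with a pooling step.
        if level < depth:
            rf += (pool - 1) * jump
            jump *= pool
    return rf


# How the bottleneck TRF grows as the encoder gets deeper.
for d in range(5):
    print(f"depth={d}: TRF side = {unet_bottleneck_trf(d)}")
```

With the default settings the TRF roughly doubles with each added level, which is why depth is the cheapest knob for enlarging the TRF and also why the added computation needs to be weighed against the size of the structures being segmented.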


Source journal

Journal of Medical Imaging (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore: 4.10

Self-citation rate: 4.20%

Journal description: JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease as well as in the understanding of normal. The scope of JMI includes: Imaging physics, Tomographic reconstruction algorithms (such as those in CT and MRI), Image processing and deep learning, Computer-aided diagnosis and quantitative image analysis, Visualization and modeling, Picture archiving and communications systems (PACS), Image perception and observer performance, Technology assessment, Ultrasonic imaging, Image-guided procedures, Digital pathology, Biomedical applications of biomedical imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.