Enhanced breast mass segmentation in mammograms using a hybrid transformer UNet model
Authors: Shahriar Mohammadi, Mohammad Ahmadi Livani
Journal: Computers in Biology and Medicine, vol. 184, article 109432 (Journal Article; impact factor 7.0; JCR Q1, Biology)
Published: 2024-11-19
DOI: https://doi.org/10.1016/j.compbiomed.2024.109432
Citation count: 0
Abstract
Breast mass segmentation plays a crucial role in early breast cancer detection and diagnosis. Convolutional Neural Networks (CNNs) have been widely used for this task, but their reliance on local receptive fields limits their ability to capture long-range dependencies. Vision Transformers (ViTs), on the other hand, excel in this area by leveraging multi-head self-attention mechanisms to generate attention maps that dynamically gather global spatial information, significantly outperforming CNN-based architectures in various tasks. However, traditional transformer-based models come with challenges, including high computational complexity due to the self-attention mechanism and inefficiency in the static MLP fusion process. To overcome these issues, the Hybrid Transformer U-Net (HTU-Net) model is proposed for breast mass segmentation in mammography. Channel- and spatial-enhanced self-attention mechanisms are integrated with convolutional layers in HTU-Net, creating a hybrid architecture that combines the strengths of both CNNs and ViTs. The introduction of a multiscale attention mechanism further improves the model's ability to fuse information from different resolutions, enhancing the decoder's capacity to reconstruct fine details in the segmented output. By using both local texture-based features and global contextual information, HTU-Net captures essential features more effectively, thus improving segmentation performance. Experimental results across multiple datasets, including CBIS-DDSM and INbreast, demonstrate that HTU-Net outperforms several state-of-the-art methods, achieving superior accuracy, Dice similarity coefficient, and intersection over union. This work highlights the potential of hybrid architectures in advancing computer-aided diagnosis systems, particularly in improving segmentation quality and reliability for breast cancer detection.
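The abstract reports accuracy, Dice similarity coefficient, and intersection over union but does not spell out how they are computed. The sketch below uses the standard pixel-wise definitions for binary masks, which is an assumption about the paper's evaluation. The function names, the nested-list mask representation, and the small `eps` smoothing term are illustrative choices, not taken from the paper.

```python
from __future__ import annotations

from typing import Sequence

# Hypothetical representation: a 2D binary mask, 1 = mass pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) pixel counts for two same-shaped binary masks."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # predicted mass, truly mass
            elif p:
                fp += 1      # predicted mass, truly background
            elif t:
                fn += 1      # missed mass pixel
            else:
                tn += 1      # background correctly left unlabeled
    return tp, fp, fn, tn


def dice(pred: Mask, truth: Mask, eps: float = 1e-7) -> float:
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    # eps (an assumption) avoids 0/0 when both masks are empty.
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def iou(pred: Mask, truth: Mask, eps: float = 1e-7) -> float:
    """Intersection over union (Jaccard index): |A∩B| / |A∪B|."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    return (tp + eps) / (tp + fp + fn + eps)


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)


# Toy 3x3 example: 2 overlapping pixels, 1 false positive, 1 false negative.
example_pred = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
example_truth = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
print(dice(example_pred, example_truth))
print(iou(example_pred, example_truth))
print(pixel_accuracy(example_pred, example_truth))
```

Because masses usually occupy a small part of a mammogram, pixel accuracy is dominated by background pixels, which is one reason segmentation work commonly reports Dice and IoU alongside it.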
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.