Bgman: Boundary-Prior-Guided Multi-scale Aggregation Network for skin lesion segmentation
Zhenyang Huang, Yixing Zhao, Jinjiang Li, Yepeng Liu
International Journal of Machine Learning and Cybernetics, published 2024-07-26
DOI: 10.1007/s13042-024-02284-3 (https://doi.org/10.1007/s13042-024-02284-3)
Citations: 0
Abstract
Skin lesion segmentation is a fundamental task in medical image analysis. Deep learning approaches have become essential tools for segmenting medical images, as their accuracy in analyzing abnormalities plays a critical role in determining the final diagnostic result. Because of the inherent difficulties presented by medical images, including variations in lesion shape and size and the indistinct boundaries between lesions and the surrounding background, many conventional algorithms struggle to meet the growing demand for high segmentation accuracy. To improve the capture of edge features and fine lesion details, this paper presents the Boundary-Prior-Guided Multi-Scale Aggregation Network for skin lesion segmentation (BGMAN). BGMAN follows a basic encoder–decoder structure, in which the encoder employs a prevalent CNN-based architecture to capture semantic information. We propose the Transformer Bridge Block (TBB) and employ it to enhance the multi-scale features captured by the encoder; the TBB strengthens weak feature responses and establishes long-range relationships among features. To augment BGMAN's ability to identify boundaries, a boundary-guided decoder is designed, using the Boundary Aware Block (BAB) and the Cross Scale Fusion Block (CSFB) to guide the decoding process. The BAB acquires features embedded with explicit boundary information under the supervision of a boundary mask, while the CSFB aggregates boundary features from different scales using learnable embeddings. The proposed method has been validated on the ISIC2016, ISIC2017, and PH^2 datasets, where it outperforms current mainstream networks: F1 92.99 and IoU 87.71 on ISIC2016, F1 86.42 and IoU 78.34 on ISIC2017, and F1 94.83 and IoU 90.26 on PH^2.
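For readers unfamiliar with the reported metrics, the sketch below shows how per-image F1 (equivalent to the Dice coefficient) and IoU (Jaccard index) are commonly computed for binary segmentation masks. It is an illustrative NumPy example, not the evaluation code used in the paper; the threshold and smoothing constant are assumed values.

```python
import numpy as np

def f1_and_iou(pred_prob: np.ndarray, gt_mask: np.ndarray,
               threshold: float = 0.5, eps: float = 1e-7):
    """Compute F1 (Dice) and IoU for one binary segmentation prediction.

    pred_prob: predicted foreground probabilities in [0, 1], shape (H, W)
    gt_mask:   ground-truth binary mask with values {0, 1}, shape (H, W)
    threshold and eps are illustrative choices, not values from the paper.
    """
    pred = (pred_prob >= threshold).astype(np.float64)
    gt = gt_mask.astype(np.float64)

    intersection = (pred * gt).sum()
    pred_sum = pred.sum()
    gt_sum = gt.sum()
    union = pred_sum + gt_sum - intersection

    f1 = (2.0 * intersection + eps) / (pred_sum + gt_sum + eps)  # Dice coefficient
    iou = (intersection + eps) / (union + eps)                   # Jaccard index
    return f1, iou

# Toy example: a 4x4 prediction that overshoots the ground truth by one column
if __name__ == "__main__":
    gt = np.zeros((4, 4)); gt[1:3, 1:3] = 1
    pred = np.zeros((4, 4)); pred[1:3, 1:4] = 0.9
    print(f1_and_iou(pred, gt))  # F1 = 0.8, IoU ≈ 0.667
```

Dataset-level numbers such as those quoted above are typically averages of per-image scores, though the exact protocol depends on the benchmark.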
Journal introduction:
Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data.
The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC.
Key research areas to be covered by the journal include:
Machine Learning for modeling interactions between systems
Pattern Recognition technology to support discovery of system-environment interaction
Control of system-environment interactions
Biochemical interaction in biological and biologically-inspired systems
Learning for improvement of communication schemes between systems