{"title":"FI-Net: Rethinking Feature Interactions for Medical Image Segmentation","authors":"Yuhan Ding, Jinhui Liu, Yunbo He, Jinliang Huang, Haisu Liang, Zhenglin Yi, Yongjie Wang","doi":"10.1002/aisy.202400201","DOIUrl":null,"url":null,"abstract":"<p>To solve the problems of existing hybrid networks based on convolutional neural networks (CNN) and Transformers, we propose a new encoder–decoder network FI-Net based on CNN-Transformer for medical image segmentation. In the encoder part, a dual-stream encoder is used to capture local details and long-range dependencies. Moreover, the attentional feature fusion module is used to perform interactive feature fusion of dual-branch features, maximizing the retention of local details and global semantic information in medical images. At the same time, the multi-scale feature aggregation module is used to aggregate local information and capture multi-scale context to mine more semantic details. The multi-level feature bridging module is used in skip connections to bridge multi-level features and mask information to assist multi-scale feature interaction. Experimental results on seven public medical image datasets fully demonstrate the effectiveness and advancement of our method. In future work, we plan to extend FI-Net to support 3D medical image segmentation tasks and combine self-supervised learning and knowledge distillation to alleviate the overfitting problem of limited data training.</p>","PeriodicalId":93858,"journal":{"name":"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)","volume":"6 12","pages":""},"PeriodicalIF":6.1000,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aisy.202400201","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)","FirstCategoryId":"1085","ListUrlMain":"https://advanced.onlinelibrary.wiley.com/doi/10.1002/aisy.202400201","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
To address the limitations of existing hybrid networks that combine convolutional neural networks (CNNs) and Transformers, we propose FI-Net, a new CNN–Transformer encoder–decoder network for medical image segmentation. In the encoder, a dual-stream design captures both local details and long-range dependencies. An attentional feature fusion module interactively fuses the dual-branch features, retaining as much of the local detail and global semantic information in medical images as possible. At the same time, a multi-scale feature aggregation module aggregates local information and captures multi-scale context to mine further semantic details, and a multi-level feature bridging module in the skip connections bridges multi-level features with mask information to assist multi-scale feature interaction. Experimental results on seven public medical image datasets demonstrate the effectiveness and competitiveness of our method. In future work, we plan to extend FI-Net to 3D medical image segmentation and to combine self-supervised learning with knowledge distillation to alleviate overfitting when training on limited data.
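To illustrate the kind of dual-branch interaction the abstract describes, the sketch below fuses a CNN-branch feature map (local details) with a Transformer-branch feature map (long-range dependencies) using a learned channel-attention gate. This is a minimal, hypothetical PyTorch sketch, not the paper's actual attentional feature fusion module: the class name, gate structure, and reduction ratio are all assumptions for illustration only.

```python
import torch
import torch.nn as nn


class AttentionalFeatureFusion(nn.Module):
    """Hypothetical sketch of attention-gated fusion of two feature branches.

    Not the FI-Net module from the paper; structure and names are assumed.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Gate sees both branches concatenated and emits one weight per channel.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # global spatial context per channel
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),  # per-channel weight in [0, 1]
        )

    def forward(self, f_local: torch.Tensor, f_global: torch.Tensor) -> torch.Tensor:
        # Convex combination per channel: w -> CNN branch, (1 - w) -> Transformer branch.
        w = self.gate(torch.cat([f_local, f_global], dim=1))
        return w * f_local + (1.0 - w) * f_global


# Usage: fuse same-shape feature maps from the two encoder streams.
fuse = AttentionalFeatureFusion(channels=64)
cnn_feat = torch.randn(2, 64, 32, 32)    # local-detail branch
trans_feat = torch.randn(2, 64, 32, 32)  # long-range-dependency branch
out = fuse(cnn_feat, trans_feat)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

Because the gate outputs a sigmoid weight, the fused map is a per-channel convex combination of the two branches, so neither stream's information is discarded outright; the network learns, channel by channel, how much local versus global evidence to keep.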