Cross-attention guided loss-based deep dual-branch fusion network for liver tumor classification

IF 14.7 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Information Fusion · Pub Date: 2024-09-24 · DOI: 10.1016/j.inffus.2024.102713
Rui Wang, Xiaoshuang Shi, Shuting Pang, Yidi Chen, Xiaofeng Zhu, Wentao Wang, Jiabin Cai, Danjun Song, Kang Li
Information Fusion, Volume 114, Article 102713 · Publication type: Journal Article · Source: https://www.sciencedirect.com/science/article/pii/S1566253524004913
Citations: 0

Abstract

Convolutional neural networks (CNNs) and multiple instance learning (MIL) methods have recently been applied successfully to MRI images. However, CNNs take the whole image as the model input and shrink the feature map with a downsampling strategy (such as max or mean pooling), so they may neglect some local details, while MIL methods learn instance-level or local features without considering spatial information. To overcome these issues, we propose a novel cross-attention guided loss-based dual-branch framework (LCA-DB) that leverages spatial and local image information simultaneously. It is composed of an image-based attention network (IA-Net), a patch-based attention network (PA-Net) and a cross-attention module (CA). Specifically, IA-Net learns image features directly, using loss-based attention to mine significant regions, while PA-Net captures patch-specific representations to extract crucial patches related to the tumor. The cross-attention module integrates the patch-level features of the two branches using the attention weights generated by each other, helping each branch mine complementary region information and strengthening the collaboration between them. We also employ an attention similarity loss to further reduce the semantic inconsistency between the attention weights obtained from the two branches. Extensive experiments on three liver tumor classification tasks demonstrate the effectiveness of the proposed framework. For example, on the seven-class LLD-MMRI-7 liver tumor classification task, our method achieves 69.2% accuracy, a 65.9% F1 score and 88.5% AUC, with classification and interpretation performance superior to recent state-of-the-art methods. The source code of LCA-DB is available at https://github.com/Wangrui-berry/Cross-attention.
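The interaction between the two branches can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the repository linked in the abstract and may differ substantially). It assumes that each branch yields one feature vector and one attention score per patch, that "attention weights generated from each other" means each branch's patch features are pooled with the other branch's softmax weights, and that the attention similarity loss can be written as one minus cosine similarity. All function names here are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pool(features, weights):
    """Attention-weighted sum of per-patch feature vectors."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def cross_attention_fuse(ia_feats, pa_feats, ia_scores, pa_scores):
    """Pool each branch's patch features with the OTHER branch's weights.

    Assumption: this is one plausible reading of the cross-attention
    module described in the abstract, not the published design.
    """
    ia_w = softmax(ia_scores)
    pa_w = softmax(pa_scores)
    ia_fused = weighted_pool(ia_feats, pa_w)  # image branch guided by patch branch
    pa_fused = weighted_pool(pa_feats, ia_w)  # patch branch guided by image branch
    return ia_fused + pa_fused, ia_w, pa_w    # concatenated representation


def attention_similarity_loss(ia_w, pa_w):
    """Assumed form of the loss: 1 - cosine similarity of the two weight vectors."""
    dot = sum(a * b for a, b in zip(ia_w, pa_w))
    norm_a = math.sqrt(sum(a * a for a in ia_w))
    norm_b = math.sqrt(sum(b * b for b in pa_w))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Two patches with 2-dimensional features per branch (toy values).
    fused, ia_w, pa_w = cross_attention_fuse(
        [[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]],
        [0.2, 1.5],
        [1.0, 0.1],
    )
    print(fused, attention_similarity_loss(ia_w, pa_w))
```

In this sketch the loss approaches zero as the two branches agree on which patches matter, which is the sense in which such a term would reduce inconsistency between the two sets of attention weights.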
Source journal
Information Fusion (category: Engineering and Technology; Computer Science, Theory and Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.