{"title":"Structure-Preserved Self-Attention for Fusion Image Information in Multiple Color Spaces","authors":"Zhu He;Mingwei Lin;Xin Luo;Zeshui Xu","doi":"10.1109/TNNLS.2024.3490800","DOIUrl":null,"url":null,"abstract":"The selection and utilization of different color spaces significantly impact the recognition performance of deep learning models in downstream tasks. Existing studies typically leverage image information from various color spaces through model integration or channel concatenation. However, these methods result in excessive model size and suboptimal utilization of image information. In this study, we propose the structure-preserved self-attention network (SPSANet) model for efficient fusion of image information from different color spaces. This model incorporates a novel structure-preserved self-attention (SPSA) module that employs a single-head pixel-wise attention mechanism, as opposed to the conventional multihead self-attention (MHSA) approach. Specifically, feature maps from all color space grouping paths are utilized for similarity matching, enabling the model to focus on critical pixel locations across different color spaces. This design mitigates the dependence of the SPSANet model on the choice of color space while enhancing the advantages of integrating multiple color spaces. The SPSANet model also employs channel shuffle operations to facilitate limited interaction between information flows from different color space paths. Experimental results demonstrate that the SPSANet model, utilizing eight common color spaces—RGB, Luv, XYZ, Lab, HSV, YCrCb, YUV, and HLS—achieves superior recognition performance with reduced parameters and computational cost.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 7","pages":"13021-13035"},"PeriodicalIF":8.9000,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10750905/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The selection and utilization of different color spaces significantly impact the recognition performance of deep learning models in downstream tasks. Existing studies typically leverage image information from various color spaces through model integration or channel concatenation. However, these methods result in excessive model size and suboptimal utilization of image information. In this study, we propose the structure-preserved self-attention network (SPSANet) model for efficient fusion of image information from different color spaces. This model incorporates a novel structure-preserved self-attention (SPSA) module that employs a single-head pixel-wise attention mechanism, as opposed to the conventional multihead self-attention (MHSA) approach. Specifically, feature maps from all color space grouping paths are utilized for similarity matching, enabling the model to focus on critical pixel locations across different color spaces. This design mitigates the dependence of the SPSANet model on the choice of color space while enhancing the advantages of integrating multiple color spaces. The SPSANet model also employs channel shuffle operations to facilitate limited interaction between information flows from different color space paths. Experimental results demonstrate that the SPSANet model, utilizing eight common color spaces—RGB, Luv, XYZ, Lab, HSV, YCrCb, YUV, and HLS—achieves superior recognition performance with reduced parameters and computational cost.
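To make the multi-color-space input concrete, the following is a minimal sketch of converting an RGB image into the eight color spaces listed in the abstract with OpenCV and stacking them along the channel axis. The abstract does not specify how SPSANet arranges its grouping paths, so the helper name `to_multi_color_space`, the simple channel-wise stacking, and the rough [0, 1] scaling are illustrative assumptions, not the paper's pipeline.

```python
import cv2
import numpy as np

# Hypothetical helper (not from the paper): convert one RGB image into the
# eight color spaces named in the abstract and stack them channel-wise.
COLOR_CODES = {
    "RGB":   None,  # input is assumed to already be RGB
    "Luv":   cv2.COLOR_RGB2Luv,
    "XYZ":   cv2.COLOR_RGB2XYZ,
    "Lab":   cv2.COLOR_RGB2Lab,
    "HSV":   cv2.COLOR_RGB2HSV,
    "YCrCb": cv2.COLOR_RGB2YCrCb,
    "YUV":   cv2.COLOR_RGB2YUV,
    "HLS":   cv2.COLOR_RGB2HLS,
}

def to_multi_color_space(rgb_u8: np.ndarray) -> np.ndarray:
    """rgb_u8: (H, W, 3) uint8 RGB image -> (H, W, 24) float32 stack."""
    planes = []
    for code in COLOR_CODES.values():
        img = rgb_u8 if code is None else cv2.cvtColor(rgb_u8, code)
        planes.append(img.astype(np.float32) / 255.0)  # rough [0, 1] scaling
    return np.concatenate(planes, axis=-1)  # 8 spaces x 3 channels = 24
```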
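Likewise, here is a minimal PyTorch sketch of the two mechanisms the abstract names: a single-head attention computed over flattened pixel positions, and a ShuffleNet-style channel shuffle between grouping paths. The SPSA module's actual projections, normalization, and placement are not given in the abstract, so the class name, 1x1-convolution projections, scaling, and residual connection below are all assumptions.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """ShuffleNet-style channel shuffle: permits limited information
    exchange between the color-space grouping paths."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)

class SingleHeadPixelAttention(nn.Module):
    """Single-head attention over flattened pixel positions, so the
    similarity matrix compares every spatial location against every other
    across the concatenated color-space feature maps (a sketch; not the
    paper's SPSA module)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions as query/key/value projections (an assumption;
        # the abstract does not describe the SPSA module's internals).
        self.q = nn.Conv2d(channels, channels, kernel_size=1)
        self.k = nn.Conv2d(channels, channels, kernel_size=1)
        self.v = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (b, hw, c)
        k = self.k(x).flatten(2)                        # (b, c, hw)
        v = self.v(x).flatten(2).transpose(1, 2)        # (b, hw, c)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # (b, hw, hw) pixel similarities
        out = (attn @ v).transpose(1, 2).view(b, c, h, w)
        return x + out                                  # residual connection (assumed)
```

With the 24-channel stack from the previous sketch, `channel_shuffle(feats, groups=8)` would interleave one channel from each color-space path into every group before attention, which is one plausible reading of the "limited interaction" between paths the abstract describes.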
About the Journal
IEEE Transactions on Neural Networks and Learning Systems publishes scholarly articles on the theory, design, and applications of neural networks and other learning systems, with an emphasis on technical and scientific research in this domain.