SNN using color-opponent and attention mechanisms for object recognition

Pattern Recognition, vol. 158, Article 111070; IF 7.5; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence; Pub Date: 2024-10-05; DOI: 10.1016/j.patcog.2024.111070

Zhiwei Yao, Shaobing Gao, Wenjuan Li


Citations: 0

Abstract

Current spiking neural networks (SNNs) rely on spike-timing-dependent plasticity (STDP) primarily for shape learning in object recognition tasks, overlooking the equally critical role of color information. To address this gap, our study introduces an unsupervised variant of STDP that incorporates principles from the color-opponency mechanisms (COM) and classical receptive fields (CRF) found in the biological visual system, so that color information is integrated during parameter updates within the SNN architecture. Our approach first preprocesses images into two distinct feature maps: one for shape and another for color. Signals derived from COM and intensity then concurrently drive the STDP process, updating the parameters associated with both the color and shape feature maps. Furthermore, we propose a channel-wise attention mechanism to better differentiate objects that share similar shapes or colors. Specifically, this mechanism uses convolution to generate an output spike-wave and identifies a winner based on the earliest spike time and maximal potential. The winning kernel computes attention, which is then applied via convolution to each input image feature map, generating post-feature maps. An STDP-like normalization rule compares firing times between pre- and post-feature maps, dynamically adjusting channel weights to optimize object recognition during the training phase.
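To make the preprocessing step more concrete, the following is a minimal sketch of how one RGB pixel might be split into an intensity (shape) signal and color-opponent signals, and how such a signal could be turned into a spike time. The opponent formulas (red minus green, blue minus yellow) and the intensity-to-latency coding are common textbook choices assumed here for illustration; the abstract does not specify the authors' exact formulation. The function names and the `t_max` parameter are hypothetical.

```python
def opponent_channels(pixel):
    """Map one RGB pixel (floats in [0, 1]) to intensity and opponent signals.

    Assumed formulation, not taken from the paper:
    intensity = mean of R, G, B; red-green = R - G; blue-yellow = B - (R + G) / 2.
    """
    r, g, b = pixel
    yellow = (r + g) / 2.0
    return {
        "intensity": (r + g + b) / 3.0,
        "red_green": r - g,
        "blue_yellow": b - yellow,
    }


def to_spike_latency(value, t_max=15):
    """Intensity-to-latency coding: larger positive responses fire earlier.

    Returns a time step in [0, t_max - 1], or None when the response is
    non-positive and the neuron never fires.
    """
    if value <= 0:
        return None
    value = min(value, 1.0)  # clip so the latency stays inside the window
    return round((1.0 - value) * (t_max - 1))


if __name__ == "__main__":
    signals = opponent_channels((1.0, 0.0, 0.0))  # a pure red pixel
    for name, value in signals.items():
        print(name, value, to_spike_latency(value))
```

For a pure red pixel, the red-green signal is at its maximum and therefore fires at the first time step, while the negative blue-yellow signal does not fire at all. A fuller model would typically also keep the rectified opposite polarities (green-red, yellow-blue) as separate channels.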

We assessed the proposed algorithm using SNNs with both single-layer and multi-layer architectures across three datasets. Experimental findings highlight its efficacy and superiority in complex object recognition tasks compared to state-of-the-art (SOTA) algorithms. Notably, our approach achieved a 20% performance improvement over the SOTA on the Caltech-101 dataset. Moreover, the algorithm is well suited to energy-efficient hardware implementation, since it leverages a winner-selection mechanism based on the earliest spike time.
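The winner-selection idea mentioned above (earliest spike time, with maximal potential as the tie-breaker) can be sketched in a few lines, together with a weight update for the winner. The update shown is the simplified, sign-only STDP rule widely used in convolutional SNN work, assumed here as a stand-in; it is not the authors' color-aware variant, whose details the abstract does not give. The function names and the learning rates `a_plus` and `a_minus` are illustrative assumptions.

```python
def select_winner(spike_times, potentials):
    """Return the index of the kernel that fires earliest.

    spike_times: list of int or None (None means the kernel never fired).
    potentials: list of float, used only to break ties on spike time.
    Returns None when no kernel fired.
    """
    candidates = [i for i, t in enumerate(spike_times) if t is not None]
    if not candidates:
        return None
    # Earliest time first; among equal times, the highest potential wins.
    return min(candidates, key=lambda i: (spike_times[i], -potentials[i]))


def stdp_update(weights, pre_times, post_time, a_plus=0.004, a_minus=0.003):
    """Simplified STDP (assumed rule): only the order of spikes matters.

    A synapse is strengthened when its input fired no later than the
    output neuron, and weakened otherwise. The w * (1 - w) factor keeps
    weights inside (0, 1).
    """
    updated = []
    for w, t_pre in zip(weights, pre_times):
        if t_pre is not None and t_pre <= post_time:
            dw = a_plus * w * (1.0 - w)
        else:
            dw = -a_minus * w * (1.0 - w)
        updated.append(w + dw)
    return updated


if __name__ == "__main__":
    winner = select_winner([3, 1, 1, None], [0.9, 0.4, 0.7, 0.0])
    print("winner kernel:", winner)
    print(stdp_update([0.5, 0.5], pre_times=[1, 5], post_time=3))
```

Because selection needs only a comparison of spike times, with potentials consulted on ties, it avoids dense floating-point competition across kernels, which is one reason such schemes are often described as hardware-friendly.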




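Source journal

Pattern Recognition (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months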

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.