{"title":"SNN using color-opponent and attention mechanisms for object recognition","authors":"Zhiwei Yao , Shaobing Gao , Wenjuan Li","doi":"10.1016/j.patcog.2024.111070","DOIUrl":null,"url":null,"abstract":"<div><div>The current spiking neural network (SNN) relies on spike-timing-dependent plasticity (STDP) primarily for shape learning in object recognition tasks, overlooking the equally critical aspect of color information. To address this gap, our study introduces an unsupervised variant of STDP that incorporates principles from color-opponency mechanisms (COM) and classical receptive fields (CRF) found in the biological visual system, facilitating the integration of color information during parameter updates within the SNN architecture. Our approach initially preprocesses images into two distinct feature maps: one for shape and another for color. Then, signals derived from COM and intensity concurrently drive the STDP process, thereby updating parameters associated with both color and shape feature maps. Furthermore, we propose a channel-wise attention mechanism to enhance differentiation among objects sharing similar shapes or colors. Specifically, this mechanism utilizes convolution to generate an output spike-wave, identifying a winner based on earliest spike timing and maximal potential. The winning kernel computes attention, which is then applied via convolution to each input image feature map, generating post-feature maps. A STDP-like normalization rule compares firing times between pre- and post-feature maps, dynamically adjusting channel weights to optimize object recognition during the training phase.</div><div>We assessed the proposed algorithm using SNN with both single-layer and multi-layer architectures across three datasets. Experimental findings highlight its efficacy and superiority in complex object recognition tasks compared to state-of-the-art (SOTA) algorithms. Notably, our approach achieved a significant 20% performance improvement over the SOTA on the Caltech-101 dataset. Moreover, the algorithm is well-suited for hardware implementation and energy efficiency, leveraging a winner-selection mechanism based on the earliest spike time.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"158 ","pages":"Article 111070"},"PeriodicalIF":7.5000,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0031320324008215","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
The current spiking neural network (SNN) relies on spike-timing-dependent plasticity (STDP) primarily for shape learning in object recognition tasks, overlooking the equally critical aspect of color information. To address this gap, our study introduces an unsupervised variant of STDP that incorporates principles from color-opponency mechanisms (COM) and classical receptive fields (CRF) found in the biological visual system, facilitating the integration of color information during parameter updates within the SNN architecture. Our approach initially preprocesses images into two distinct feature maps: one for shape and another for color. Then, signals derived from COM and intensity concurrently drive the STDP process, thereby updating parameters associated with both color and shape feature maps. Furthermore, we propose a channel-wise attention mechanism to enhance differentiation among objects sharing similar shapes or colors. Specifically, this mechanism utilizes convolution to generate an output spike-wave, identifying a winner based on earliest spike timing and maximal potential. The winning kernel computes attention, which is then applied via convolution to each input image feature map, generating post-feature maps. A STDP-like normalization rule compares firing times between pre- and post-feature maps, dynamically adjusting channel weights to optimize object recognition during the training phase.
We assessed the proposed algorithm using SNN with both single-layer and multi-layer architectures across three datasets. Experimental findings highlight its efficacy and superiority in complex object recognition tasks compared to state-of-the-art (SOTA) algorithms. Notably, our approach achieved a significant 20% performance improvement over the SOTA on the Caltech-101 dataset. Moreover, the algorithm is well-suited for hardware implementation and energy efficiency, leveraging a winner-selection mechanism based on the earliest spike time.
期刊介绍:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.