Self-Supervised High-Order Information Bottleneck Learning of Spiking Neural Network for Robust Event-Based Optical Flow Estimation

IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 18.6), Pub Date: 2024-12-02, DOI: 10.1109/TPAMI.2024.3510627

Shuangming Yang, Bernabé Linares-Barranco, Yuzhu Wu, Badong Chen

Volume 47, Issue 4, pp. 2280-2297. Full text: https://ieeexplore.ieee.org/document/10772601/

Citation count: 0

Abstract

Event cameras provide a foundation for visual perception in scenes characterized by high speed and a wide dynamic range. Although deep learning techniques have achieved remarkable success in estimating event-based optical flow, existing methods have not adequately addressed the role of temporal information in capturing spatiotemporal features. Because the dynamics of spiking neurons in spiking neural networks (SNNs) preserve important information while forgetting redundant information over time, SNNs are expected to outperform analog neural networks (ANNs) with the same architecture and size in sequential regression tasks. In addition, SNNs on neuromorphic hardware offer extremely low power consumption. However, present SNN architectures suffer from limited generalization and robustness during training, particularly in noisy scenes. To tackle these problems, this study introduces a spike-based self-supervised learning algorithm, SeLHIB, which leverages information bottleneck theory. Using event-based camera inputs, SeLHIB enables robust estimation of optical flow in the presence of noise. To the best of our knowledge, this is the first proposal of a self-supervised information bottleneck learning strategy based on SNNs. Furthermore, we develop spike-based self-supervised algorithms with nonlinear and high-order information bottleneck learning, which employ nonlinear and high-order mutual information to enhance the extraction of relevant information and eliminate redundancy. We demonstrate that SeLHIB significantly enhances the generalization ability and robustness of optical flow estimation under various noise conditions. In terms of energy efficiency, SeLHIB reduces energy consumption by 90.44% and 45.70% compared to its counterpart ANN and SNN models, respectively, while attaining 33.78% lower AEE (MVSEC), 5.96% lower RSAT (ECD), and 6.21% lower RSAT (HQF) than counterpart ANN implementations of the same size and architecture.
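For readers unfamiliar with the AEE figure quoted above, the following is a minimal sketch of how average endpoint error and a relative reduction such as "33.78% lower" are usually computed. It is not the authors' evaluation code: the function names `average_endpoint_error` and `relative_reduction` are hypothetical, the flow vectors are toy values rather than results from the paper, and real benchmarks commonly restrict the average to pixels with valid ground truth, which this sketch does not model.

```python
import math


def average_endpoint_error(pred, truth):
    """Mean Euclidean distance between predicted and reference flow vectors.

    pred, truth: equal-length sequences of (u, v) flow vectors in pixels.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("flow fields must be non-empty and the same length")
    total = 0.0
    for (u_p, v_p), (u_t, v_t) in zip(pred, truth):
        # Endpoint error of one vector: length of the difference vector.
        total += math.hypot(u_p - u_t, v_p - v_t)
    return total / len(pred)


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Toy values for illustration only (assumed, not taken from the paper).
truth = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
baseline_pred = [(1.0, 3.0), (4.0, 2.0), (3.0, 9.0)]  # endpoint errors 3, 4, 5
improved_pred = [(1.0, 1.5), (2.0, 2.0), (3.0, 6.5)]  # endpoint errors 1.5, 2, 2.5

baseline_aee = average_endpoint_error(baseline_pred, truth)
improved_aee = average_endpoint_error(improved_pred, truth)
reduction = relative_reduction(baseline_aee, improved_aee)
print(baseline_aee, improved_aee, reduction)
```

Under this reading, a reported reduction is relative to the baseline model's error, so the same percentage can correspond to very different absolute AEE gaps on different datasets.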
