Self-Supervised High-Order Information Bottleneck Learning of Spiking Neural Network for Robust Event-Based Optical Flow Estimation

Shuangming Yang;Bernabé Linares-Barranco;Yuzhu Wu;Badong Chen
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 4, pp. 2280-2297
DOI: 10.1109/TPAMI.2024.3510627
Publication date: 2024-12-02
Publication type: Journal Article
Impact factor: 18.6
Source: https://ieeexplore.ieee.org/document/10772601/
Citations: 0

Abstract

Event cameras provide a foundation for visual perception in scenes characterized by high speed and a wide dynamic range. Although deep learning techniques have achieved remarkable success in event-based optical flow estimation, existing methods have not adequately addressed the role of temporal information in capturing spatiotemporal features. Because the dynamics of spiking neurons in spiking neural networks (SNNs) preserve important information while forgetting redundant information over time, SNNs are expected to outperform analog neural networks (ANNs) of the same architecture and size in sequential regression tasks. In addition, SNNs running on neuromorphic hardware offer extremely low power consumption. However, current SNN architectures suffer from limited generalization and robustness during training, particularly in noisy scenes. To tackle these problems, this study introduces a spike-based self-supervised learning algorithm, SeLHIB, which leverages information bottleneck theory. Using event-based camera inputs, SeLHIB enables robust optical flow estimation in the presence of noise. To the best of our knowledge, this is the first proposal of a self-supervised information bottleneck learning strategy based on SNNs. Furthermore, we develop spike-based self-supervised algorithms with nonlinear and high-order information bottleneck learning, which employ nonlinear and high-order mutual information to enhance the extraction of relevant information and eliminate redundancy. We demonstrate that SeLHIB significantly enhances the generalization ability and robustness of optical flow estimation under various noise conditions. In terms of energy efficiency, SeLHIB reduces energy consumption by 90.44% and 45.70% compared to its counterpart ANN and SNN models, respectively, while attaining 33.78% lower AEE (MVSEC), 5.96% lower RSAT (ECD), and 6.21% lower RSAT (HQF) than counterpart ANN implementations of the same size and architecture.
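
To make the reported accuracy figures easier to read, the following is a minimal sketch of how an endpoint-error metric and a relative reduction could be computed. It assumes that AEE stands for average endpoint error, the usual optical flow metric (the mean Euclidean distance between predicted and ground-truth flow vectors); the abstract does not spell the acronym out. The function names `average_endpoint_error` and `relative_reduction` and the toy values are hypothetical and are not taken from the paper.

```python
import math


def average_endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    pred, gt: equal-length sequences of (u, v) flow vectors, one per pixel.
    Assumption: AEE in the abstract refers to this standard definition.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = 0.0
    for (pu, pv), (gu, gv) in zip(pred, gt):
        # Endpoint error of one pixel: length of the difference vector.
        total += math.hypot(pu - gu, pv - gv)
    return total / len(pred)


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # Toy data (hypothetical): four pixels, one of which is off by (3, 4).
    gt = [(0.0, 0.0)] * 4
    pred = [(3.0, 4.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    print(average_endpoint_error(pred, gt))
    # Reading of a relative figure: an error of 1.5 against a baseline of 2.0.
    print(relative_reduction(2.0, 1.5))
```

Under this reading, a statement such as "33.78% lower AEE" compares the SNN's mean endpoint error against that of the counterpart ANN as the baseline.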