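SpQuant-SNN: ultra-low precision membrane potential with sparse activations unlock the potential of on-device spiking neural networks applications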

IF 4.3; Zone 3, Materials Science; JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC); ACS Applied Electronic Materials; Pub Date: 2024-09-04; DOI: 10.3389/fnins.2024.1440000

Ahmed Hasssan, Jian Meng, Anupreetham Anupreetham, Jae-sun Seo


Citations: 0

Abstract




Spiking neural networks (SNNs) have received increasing attention due to their high biological plausibility and energy efficiency. The binary spike-based information propagation enables efficient sparse computation in event-based and static computer vision applications. However, the weight precision and especially the membrane potential precision remain as high-precision values (e.g., 32 bits) in state-of-the-art SNN algorithms. Each neuron in an SNN stores the membrane potential over time and typically updates its value in every time step. Such frequent read/write operations of high-precision membrane potential incur storage and memory access overhead in SNNs, which undermines the SNNs' compatibility with resource-constrained hardware. To resolve this inefficiency, prior works have explored the time step reduction and low-precision representation of membrane potential at a limited scale and reported significant accuracy drops. Furthermore, while recent advances in on-device AI present pruning and quantization optimization with different architectures and datasets, simultaneous pruning with quantization is highly under-explored in SNNs. In this work, we present SpQuant-SNN, a fully-quantized spiking neural network with ultra-low precision weights, membrane potential, and high spatial-channel sparsity, enabling the end-to-end low precision with significantly reduced operations on SNN. First, we propose an integer-only quantization scheme for the membrane potential with a stacked surrogate gradient function, a simple-yet-effective method that enables the smooth learning process of quantized SNN training. Second, we implement spatial-channel pruning with membrane potential prior, toward reducing the layer-wise computational complexity, and floating-point operations (FLOPs) in SNNs. Finally, to further improve the accuracy of low-precision and sparse SNN, we propose a self-adaptive learnable potential threshold for SNN training. 
Equipped with high biological adaptiveness, minimal computations, and memory utilization, SpQuant-SNN achieves state-of-the-art performance across multiple SNN models for both event-based and static image datasets, including both image classification and object detection tasks. The proposed SpQuant-SNN achieved up to 13× memory reduction and >4.7× FLOPs reduction with < 1.8% accuracy degradation for both classification and object detection tasks, compared to the SOTA baseline.
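As a rough illustration of the abstract's central idea, keeping the membrane potential as a low-precision integer, the following Python sketch simulates a single integrate-and-fire neuron whose state is saturated to a signed `bits`-bit range. It is not the SpQuant-SNN implementation: the function names, the shift-based leak, the hard reset, and the 4-bit setting are all assumptions chosen for illustration, and the stacked surrogate gradient, the spatial-channel pruning, and the learnable threshold described above are not modeled.

```python
from typing import List, Tuple


def clamp_to_bits(value: int, bits: int) -> int:
    """Saturate an integer to the signed range representable in `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def lif_step(potential: int, input_current: int,
             threshold: int, bits: int) -> Tuple[int, int]:
    """One integer-only leaky integrate-and-fire step (illustrative assumptions).

    Leak: arithmetic right shift by 1, which halves the potential and
    rounds toward negative infinity. Reset: hard reset to 0 after a spike.
    """
    potential = clamp_to_bits((potential >> 1) + input_current, bits)
    spike = 1 if potential >= threshold else 0
    if spike:
        potential = 0
    return potential, spike


def run_neuron(inputs: List[int], threshold: int = 4, bits: int = 4) -> List[int]:
    """Feed a sequence of integer input currents and collect the binary spikes."""
    potential = 0
    spikes = []
    for current in inputs:
        potential, spike = lif_step(potential, current, threshold, bits)
        spikes.append(spike)
    return spikes


def potential_storage_bits(num_neurons: int, bits: int) -> int:
    """Bits needed to hold one membrane potential per neuron."""
    return num_neurons * bits


if __name__ == "__main__":
    print(run_neuron([3, 2, 5, -1, 6]))
    full = potential_storage_bits(1000, 32)
    low = potential_storage_bits(1000, 4)
    print(f"32-bit state: {full} bits, 4-bit state: {low} bits")
```

In this toy accounting the state storage scales linearly with the bit-width, so a 4-bit potential needs one eighth of the memory of a 32-bit one. That ratio only describes the sketch and should not be read as the source of the 13× figure reported in the abstract.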


Source journal: ACS Applied Electronic Materials Multiple-

CiteScore: 7.20

Self-citation rate: 4.30%

Articles published: 567