Achieving personalized intelligence at the edge with real-time learning capabilities holds enormous promise for enhancing our daily experiences and supporting decision making, planning, and sensing. However, efficient and reliable edge learning remains difficult with current technology due to the lack of personalized data, insufficient hardware capabilities, and inherent challenges posed by online learning. Over time and across multiple developmental stages, the brain has evolved to efficiently incorporate new knowledge by gradually building on previous knowledge. In this work, we emulate the multiple stages of learning with digital neuromorphic technology that simulates the neural and synaptic processes of the brain using two stages of learning. First, a meta-training stage trains the hyperparameters of synaptic plasticity for one-shot learning using a differentiable simulation of the neuromorphic hardware. This meta-training process refines a hardware-local three-factor synaptic plasticity rule and its associated hyperparameters to align with the trained task domain. In a subsequent deployment stage, these optimized hyperparameters enable fast, data-efficient, and accurate learning of new classes. We demonstrate our approach using event-driven vision sensor data and the Intel Loihi neuromorphic processor with its plasticity dynamics, achieving real-time one-shot learning of new classes that is vastly improved over transfer learning. Our methodology can be deployed with arbitrary plasticity models and can be applied to situations demanding quick learning and adaptation at the edge, such as navigating unfamiliar environments or learning unexpected categories of data through user engagement.
{"title":"Emulating Brain-like Rapid Learning in Neuromorphic Edge Computing","authors":"Kenneth Stewart, Michael Neumeier, Sumit Bam Shrestha, Garrick Orchard, Emre Neftci","doi":"arxiv-2408.15800","DOIUrl":"https://doi.org/arxiv-2408.15800","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-28"}
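As a rough illustration of what a three-factor plasticity rule looks like, the sketch below multiplies a presynaptic trace, a postsynaptic trace, and a modulatory (third) factor to obtain a weight change. It is a generic textbook form, not the Loihi rule or the meta-trained rule from the abstract; the function names, the trace decay, and the learning rate are all hypothetical, and the last two merely stand in for the kind of hyperparameters a meta-training stage could tune.

```python
def decay_trace(trace, spike, decay=0.8):
    """Exponentially filtered spike trace: decays each step, jumps by 1 on a spike."""
    return decay * trace + spike


def three_factor_update(weight, pre_trace, post_trace, modulator, learning_rate=0.1):
    """Generic three-factor rule: dw = lr * pre * post * modulator (illustrative only)."""
    return weight + learning_rate * pre_trace * post_trace * modulator


# Build a presynaptic trace from a short spike train.
pre_trace = 0.0
for spike in [1, 0, 1]:
    pre_trace = decay_trace(pre_trace, spike)

# With an active modulator the weight changes; with a zero modulator it does not.
potentiated = three_factor_update(0.5, pre_trace, post_trace=1.0, modulator=1.0)
unchanged = three_factor_update(0.5, pre_trace, post_trace=1.0, modulator=0.0)
```

The third factor gates learning: the same pre- and postsynaptic activity leaves the weight untouched when the modulator is zero.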
Known for their low energy consumption, spiking neural networks (SNNs) have gained considerable attention over the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) on vision tasks, they are rarely used for long-sequence tasks, despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long-sequence learning by leveraging the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block while realizing sparse synaptic computation. Furthermore, to resolve the conflict between event-driven neuronal dynamics and parallel computing, we propose a lightweight surrogate dynamic network that accurately predicts the after-reset membrane potential and is compatible with learnable thresholds, accelerating training by orders of magnitude compared with conventional iterative methods. On the Long Range Arena benchmark, SpikingSSM achieves performance competitive with state-of-the-art SSMs while realizing 90% network sparsity on average. On language modeling, our network significantly surpasses existing spiking large language models (spikingLLMs) on the WikiText-103 dataset with only a third of the model size, demonstrating its potential as a backbone architecture for LLMs with low computation cost.
{"title":"SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models","authors":"Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng","doi":"arxiv-2408.14909","DOIUrl":"https://doi.org/arxiv-2408.14909","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-27"}
Yujie Wu, Siyuan Xu, Jibin Wu, Lei Deng, Mingkun Xu, Qinghao Wen, Guoqi Li
The Forward-Forward (FF) algorithm was recently proposed as a local learning method to address the limitations of backpropagation (BP), offering biological plausibility along with memory-efficient and highly parallelized computational benefits. However, it suffers from suboptimal performance and poor generalization, largely due to inadequate theoretical support and a lack of effective learning strategies. In this work, we reformulate FF using distance metric learning and propose a distance-forward algorithm (DF) to improve FF performance in supervised vision tasks while preserving its local computational properties, making it competitive for efficient on-chip learning. To achieve this, we reinterpret FF through the lens of centroid-based metric learning and develop a goodness-based N-pair margin loss to facilitate the learning of discriminative features. Furthermore, we integrate layer-collaboration local update strategies to reduce information loss caused by greedy local parameter updates. Our method surpasses existing FF models and other advanced local learning approaches, with accuracies of 99.7% on MNIST, 88.2% on CIFAR-10, 59% on CIFAR-100, 95.9% on SVHN, and 82.5% on ImageNette, respectively. Moreover, it achieves comparable performance with less than 40% memory cost compared to BP training, while exhibiting stronger robustness to multiple types of hardware-related noise, demonstrating its potential for online learning and energy-efficient computation on neuromorphic chips.
{"title":"Distance-Forward Learning: Enhancing the Forward-Forward Algorithm Towards High-Performance On-Chip Learning","authors":"Yujie Wu, Siyuan Xu, Jibin Wu, Lei Deng, Mingkun Xu, Qinghao Wen, Guoqi Li","doi":"arxiv-2408.14925","DOIUrl":"https://doi.org/arxiv-2408.14925","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-27"}
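To make the notion of "goodness" concrete, the sketch below computes the layer goodness used in the original Forward-Forward formulation (the sum of squared activations) and combines it with a simple hinge-style margin between one positive sample and several negatives. The margin loss here is an illustrative stand-in and should not be read as the goodness-based N-pair margin loss of the abstract, whose exact form is not given; the function names and the margin value are assumptions.

```python
def goodness(activations):
    """Forward-Forward goodness of one layer: sum of squared activations."""
    return sum(a * a for a in activations)


def margin_loss(positive_goodness, negative_goodnesses, margin=1.0):
    """Hinge-style margin (illustrative): penalize every negative whose goodness
    is not at least `margin` below the positive sample's goodness."""
    return sum(max(0.0, margin - (positive_goodness - g)) for g in negative_goodnesses)


# Layer activations for one positive sample and two negative samples.
positive = goodness([1.0, 2.0])      # 1 + 4 = 5.0
easy_negative = goodness([0.5, 0.5])  # 0.25 + 0.25 = 0.5
hard_negative = goodness([2.0, 1.0])  # 4 + 1 = 5.0

# Only the hard negative violates the margin, so it alone contributes to the loss.
loss = margin_loss(positive, [easy_negative, hard_negative])
```

Because the loss depends only on one layer's activations, it can be evaluated and minimized locally, without propagating gradients through other layers.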
Xinyi Chen, Jibin Wu, Chenxiang Ma, Yinsong Yan, Yujie Wu, Kay Chen Tan
Spiking Neural Networks (SNNs) hold great potential to realize brain-inspired, energy-efficient computational systems. However, current SNNs still fall short of their biological counterparts in multi-scale temporal processing. This limitation has resulted in poor performance on many pattern recognition tasks whose information varies across different timescales. To address this issue, we put forward a novel spiking neuron model called the Parallel Multi-compartment Spiking Neuron (PMSN). The PMSN emulates biological neurons by incorporating multiple interacting substructures and allows the number of substructures to be adjusted flexibly to represent temporal information across diverse timescales. Additionally, to address the computational burden associated with the increased complexity of the proposed model, we introduce two parallelization techniques that decouple the temporal dependencies of neuronal updates, enabling parallelized training across different time steps. Our experimental results on a wide range of pattern recognition tasks demonstrate the superiority of PMSN. It outperforms other state-of-the-art spiking neuron models in terms of temporal processing capacity, training speed, and computation cost. Specifically, compared with the commonly used Leaky Integrate-and-Fire neuron, PMSN offers a simulation acceleration of over 10$\times$ and a 30% improvement in accuracy on the Sequential CIFAR10 dataset, while maintaining comparable computational cost.
{"title":"PMSN: A Parallel Multi-compartment Spiking Neuron for Multi-scale Temporal Processing","authors":"Xinyi Chen, Jibin Wu, Chenxiang Ma, Yinsong Yan, Yujie Wu, Kay Chen Tan","doi":"arxiv-2408.14917","DOIUrl":"https://doi.org/arxiv-2408.14917","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-27"}
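For reference, the Leaky Integrate-and-Fire baseline mentioned above can be sketched in a few lines of discrete-time Python. The decay factor, threshold, and hard reset to zero are common textbook choices assumed here for illustration, not parameters taken from the paper. The loop shows why such neurons are hard to parallelize over time: each membrane value depends on whether the previous step spiked and reset.

```python
def simulate_lif(input_current, decay=0.9, threshold=1.0):
    """Discrete-time leaky integrate-and-fire neuron with hard reset.

    Returns a list of 0/1 spikes, one per input time step.
    """
    membrane = 0.0
    spikes = []
    for current in input_current:
        # Leak the previous potential, then integrate the new input.
        membrane = decay * membrane + current
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


spike_train = simulate_lif([0.5, 0.5, 0.5, 0.0, 1.2])
# Membrane trajectory before reset: 0.5, 0.95, 1.355 (spike), 0.0, 1.2 (spike)
```

The reset inside the loop is the sequential dependency that parallel training schemes have to remove or approximate.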
Spiking neural networks (SNNs) are gaining popularity in the computational simulation and artificial intelligence fields owing to their biological plausibility and computational efficiency. This paper explores the historical development of SNNs and concludes that these two fields are intersecting and merging rapidly. Following the successful application of Dynamic Vision Sensors (DVS) and Dynamic Audio Sensors (DAS), SNNs have found suitable paradigms, such as continuous visual signal tracking, automatic speech recognition, and reinforcement learning for continuous control, that have extensively supported their key features, including spike encoding, neuronal heterogeneity, specific functional circuits, and multiscale plasticity. Compared to these real-world paradigms, the brain contains a spiking version of the biology-world paradigm, which exhibits a similar level of complexity and is usually considered a mirror of the real world. Considering the projected rapid development of invasive and parallel Brain-Computer Interfaces (BCIs), as well as the new BCI-based paradigms that include online pattern recognition and stimulus control of biological spike trains, SNNs can naturally leverage their advantages in energy efficiency, robustness, and flexibility. The biological brain has inspired the present study of SNNs and effective SNN machine-learning algorithms, which in turn can help enhance neuroscience discoveries in the brain when applied to the new BCI paradigm. Such two-way interactions with positive feedback can accelerate brain science research and brain-inspired intelligence technology.
{"title":"Research Advances and New Paradigms for Biology-inspired Spiking Neural Networks","authors":"Tianyu Zheng, Liyuan Han, Tielin Zhang","doi":"arxiv-2408.13996","DOIUrl":"https://doi.org/arxiv-2408.13996","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-26"}
Uncertainty quantification is an important part of many performance-critical applications. This paper provides a simple alternative to existing approaches such as ensemble learning and Bayesian neural networks. By directly modeling the loss distribution with an Implicit Quantile Network, we obtain an estimate of how uncertain the model is about its predictions. In experiments with the MNIST and CIFAR datasets, the mean of the estimated loss distribution is 2x higher for incorrect predictions. When data with high estimated uncertainty is removed from the test dataset, the accuracy of the model increases by as much as 10%. This method is simple to implement while offering important information to applications where the user has to know when the model could be wrong (e.g., deep learning for healthcare).
{"title":"Estimating Uncertainty with Implicit Quantile Network","authors":"Yi Hung Lim","doi":"arxiv-2408.14525","DOIUrl":"https://doi.org/arxiv-2408.14525","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-26"}
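Quantile networks are typically trained with a quantile regression objective. The sketch below shows the plain quantile (pinball) loss and checks, on a small hypothetical set of observed loss values, that minimizing it recovers the corresponding empirical quantile. It illustrates the general technique only; the abstract does not state which exact loss variant, network, or data the paper uses, and all names and numbers here are assumptions.

```python
def pinball_loss(target, prediction, tau):
    """Quantile (pinball) loss for one target/prediction pair at level tau in (0, 1)."""
    error = target - prediction
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return max(tau * error, (tau - 1.0) * error)


def best_constant_quantile(samples, tau):
    """Return the sample value that minimizes the mean pinball loss at level tau."""
    def mean_loss(candidate):
        return sum(pinball_loss(s, candidate, tau) for s in samples) / len(samples)
    return min(samples, key=mean_loss)


# Hypothetical per-example losses: mostly small, with one large outlier.
observed_losses = [0.1, 0.2, 0.3, 0.4, 5.0]
median_estimate = best_constant_quantile(observed_losses, 0.5)
upper_estimate = best_constant_quantile(observed_losses, 0.9)
```

A model that predicts several such quantiles per input describes the spread of the loss, and a wide or high predicted distribution can then serve as an uncertainty signal.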
Pre-trained Artificial Neural Networks (ANNs) exhibit robust pattern recognition capabilities and share extensive similarities with the human brain, specifically Biological Neural Networks (BNNs). We are particularly intrigued by these models' ability to acquire new knowledge through fine-tuning. In this regard, Parameter-efficient Fine-tuning (PEFT) has gained widespread adoption as a substitute for full fine-tuning because it reduces training cost and mitigates over-fitting by limiting the number of trainable parameters during adaptation. Since both ANNs and BNNs propagate information layer-by-layer, a common analogy can be drawn: weights in ANNs represent synapses in BNNs, while features (also known as latent variables or logits) in ANNs represent neurotransmitters released by neurons in BNNs. Mainstream PEFT methods aim to adjust feature or parameter values using only a limited number of trainable parameters (usually less than 1% of the total parameters), yet achieve surprisingly good results. Building upon this clue, we explore the connections between feature adjustment and parameter adjustment, resulting in our proposed method Synapses & Neurons (SAN), which learns scaling matrices for features and propagates their effects towards posterior weight matrices. Our approach draws strong inspiration from well-known neuroscience phenomena - Long-term Potentiation (LTP) and Long-term Depression (LTD), which also reveal the relationship between synapse development and neurotransmitter release levels. We conducted extensive comparisons of PEFT methods on 26 datasets using attention-based as well as convolution-based networks, and observed significant improvements over other tuning methods (+8.5% over full fine-tuning, +7% over Visual Prompt Tuning, and +3.2% over LoRA). The code will be released.
{"title":"Discovering Long-Term Effects on Parameter Efficient Fine-tuning","authors":"Gaole Dai, Yiming Tang, Chunkai Fan, Qizhe Zhang, Zhi Zhang, Yulu Gan, Chengqing Zeng, Shanghang Zhang, Tiejun Huang","doi":"arxiv-2409.06706","DOIUrl":"https://doi.org/arxiv-2409.06706","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-24"}
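One way to see a link between feature adjustment and parameter adjustment is the linear-algebra identity that scaling a feature vector element-wise before a linear layer is the same as scaling the matching rows of that layer's weight matrix. The sketch below checks this on a tiny example. It is only an interpretation under stated assumptions: a diagonal (per-feature) scale and a purely linear next layer are assumed, all names are hypothetical, and the abstract does not say that SAN propagates scales in exactly this way.

```python
def matvec(features, weights):
    """Row vector times matrix: features (length n) @ weights (n x m)."""
    columns = len(weights[0])
    return [
        sum(features[i] * weights[i][j] for i in range(len(features)))
        for j in range(columns)
    ]


def scale_features(features, scales):
    """Element-wise feature scaling (a diagonal scaling matrix)."""
    return [f * s for f, s in zip(features, scales)]


def fold_scales_into_weights(weights, scales):
    """Scale row i of the next layer's weights by scales[i]."""
    return [[s * w for w in row] for row, s in zip(weights, scales)]


features = [1.0, 2.0]
scales = [0.5, 2.0]
next_weights = [[1.0, 0.0], [3.0, 4.0]]

# Scaling features then projecting equals projecting with scaled weight rows.
scaled_then_projected = matvec(scale_features(features, scales), next_weights)
projected_with_folded = matvec(features, fold_scales_into_weights(next_weights, scales))
```

With a nonlinearity between the scaling and the next layer, the two views generally stop being exactly equivalent.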
Wentao Wu, Fanghua Hong, Xiao Wang, Chenglong Li, Jin Tang
Existing vehicle detectors are usually obtained by training a typical detector (e.g., YOLO, RCNN, DETR series) on vehicle images based on a pre-trained backbone (e.g., ResNet, ViT). Some researchers also enhance detection performance using pre-trained large foundation models. However, we think these detectors may only achieve sub-optimal results because the large models they use are not specifically designed for vehicles. In addition, their results rely heavily on visual features, and they seldom consider the alignment between the vehicle's semantic information and visual representations. In this work, we propose a new vehicle detection paradigm based on a pre-trained foundation vehicle model (VehicleMAE) and a large language model (T5), termed VFM-Det. It follows the region proposal-based detection framework, and the features of each proposal can be enhanced using VehicleMAE. More importantly, we propose a new VAtt2Vec module that predicts the vehicle semantic attributes of these proposals and transforms them into feature vectors to enhance the vision features via contrastive learning. Extensive experiments on three vehicle detection benchmark datasets demonstrate the effectiveness of our vehicle detector. Specifically, our model improves the baseline approach by $+5.1\%$ and $+6.2\%$ on the $AP_{0.5}$ and $AP_{0.75}$ metrics, respectively, on the Cityscapes dataset. The source code of this work will be released at https://github.com/Event-AHU/VFM-Det.
{"title":"VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models","authors":"Wentao Wu, Fanghua Hong, Xiao Wang, Chenglong Li, Jin Tang","doi":"arxiv-2408.13031","DOIUrl":"https://doi.org/arxiv-2408.13031","journal":{"name":"arXiv - CS - Neural and Evolutionary Computing"},"publicationDate":"2024-08-23"}
Amirhossein Nouranizadeh, Fatemeh Tabatabaei Far, Mohammad Rahmati
Evolving networks are complex data structures that emerge in a wide range of systems in science and engineering. Learning expressive representations for such networks that encode their structural connectivity and temporal evolution is essential for downstream data analytics and machine learning applications. In this study, we introduce a self-supervised method for learning representations of temporal networks and employ these representations in the dynamic link prediction task. While temporal networks are typically characterized as a sequence of interactions over the continuous time domain, our study focuses on their discrete-time versions. This enables us to balance the trade-off between computational complexity and precise modeling of the interactions. We propose a recurrent message-passing neural network architecture for modeling the information flow over time-respecting paths of temporal networks. The key feature of our method is the contrastive training objective of the model, which combines three loss functions: link prediction, graph reconstruction, and contrastive predictive coding losses. The contrastive predictive coding objective is implemented using InfoNCE losses at both local and global scales of the input graphs. We empirically show that the additional self-supervised losses enhance the training and improve the model's performance in the dynamic link prediction task. The proposed method is tested on the Enron, COLAB, and Facebook datasets and exhibits superior results compared to existing models.
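To make the training objective described above more concrete, the following is a minimal sketch of an InfoNCE loss for a single anchor, together with a weighted sum of three loss terms. It is an illustration only and not the authors' implementation: the function names `info_nce` and `combined_loss`, the dot-product similarity, the temperature value, and the equal default weights are all assumptions that the abstract does not specify.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor (hypothetical helper).

    Returns the negative log-softmax score of the positive sample
    against the negatives, using dot-product similarity (an assumption).
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def combined_loss(link_loss, recon_loss, cpc_loss, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three loss terms; the weights are assumed."""
    w_link, w_recon, w_cpc = weights
    return w_link * link_loss + w_recon * recon_loss + w_cpc * cpc_loss


# The loss is small when the anchor matches its positive and not the negative.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# The loss is large when a negative matches the anchor better than the positive.
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < misaligned)
```

In this sketch the local and global scales mentioned in the abstract would correspond to calling `info_nce` with node-level and graph-level embeddings respectively, which is again an assumption about how the two scales might be wired together.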
{"title":"Contrastive Representation Learning for Dynamic Link Prediction in Temporal Networks","authors":"Amirhossein Nouranizadeh, Fatemeh Tabatabaei Far, Mohammad Rahmati","doi":"arxiv-2408.12753","DOIUrl":"https://doi.org/arxiv-2408.12753","url":null,"abstract":"Evolving networks are complex data structures that emerge in a wide range of systems in science and engineering. Learning expressive representations for such networks that encode their structural connectivity and temporal evolution is essential for downstream data analytics and machine learning applications. In this study, we introduce a self-supervised method for learning representations of temporal networks and employ these representations in the dynamic link prediction task. While temporal networks are typically characterized as a sequence of interactions over the continuous time domain, our study focuses on their discrete-time versions. This enables us to balance the trade-off between computational complexity and precise modeling of the interactions. We propose a recurrent message-passing neural network architecture for modeling the information flow over time-respecting paths of temporal networks. The key feature of our method is the contrastive training objective of the model, which is a combination of three loss functions: link prediction, graph reconstruction, and contrastive predictive coding losses. The contrastive predictive coding objective is implemented using InfoNCE losses at both local and global scales of the input graphs. We empirically show that the additional self-supervised losses enhance the training and improve the model's performance in the dynamic link prediction task. The proposed method is tested on Enron, COLAB, and Facebook datasets and exhibits superior results compared to existing models.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Spiking Neural Network (SNN), due to its unique spike-driven nature, is more energy-efficient and effective than Artificial Neural Networks (ANNs). The encoding method directly influences the overall performance of the network, and currently, direct encoding is primarily used for directly trained SNNs. When working with static image datasets, direct encoding inputs the same feature map at every time step, failing to fully exploit the spatiotemporal properties of SNNs. While temporal encoding converts input data into spike trains with spatiotemporal characteristics, traditional SNNs utilize the same neurons when processing input data across different time steps, limiting their ability to integrate and utilize spatiotemporal information effectively. To address this, this paper employs temporal encoding and proposes the Adaptive Spiking Neural Network (ASNN), enhancing the utilization of temporal encoding in conventional SNNs. Additionally, temporal encoding is less frequently used because short time steps can lead to significant loss of input data information, often necessitating a higher number of time steps in practical applications. However, training large SNNs with long time steps is challenging due to hardware constraints. To overcome this, this paper introduces a hybrid encoding approach that not only reduces the required time steps for training but also continues to improve the overall network performance. Notably, significant improvements in classification performance are observed on both Spikformer and Spiking ResNet architectures. Our code is available at https://github.com/hhx0320/ASNN
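The contrast between direct and temporal encoding drawn above can be sketched in a few lines. This is a hedged illustration, not the ASNN code: it uses time-to-first-spike (latency) coding as one common example of temporal encoding, which may differ from the scheme the paper actually uses, and the function names `direct_encode` and `latency_encode` are hypothetical. Pixel intensities are assumed to be normalized to [0, 1].

```python
def direct_encode(pixels, num_steps):
    """Direct encoding: present the same analog values at every time step."""
    return [list(pixels) for _ in range(num_steps)]


def latency_encode(pixels, num_steps):
    """Time-to-first-spike encoding (assumed scheme).

    Each pixel emits at most one spike; brighter pixels spike earlier.
    Returns num_steps binary frames, one per time step.
    """
    frames = [[0] * len(pixels) for _ in range(num_steps)]
    for i, p in enumerate(pixels):
        p = min(max(p, 0.0), 1.0)  # clamp to the assumed [0, 1] range
        if p == 0.0:
            continue  # zero intensity never spikes
        t = round((1.0 - p) * (num_steps - 1))
        frames[t][i] = 1
    return frames


pixels = [1.0, 0.5, 0.0]
# Direct encoding repeats the identical input at every step.
print(direct_encode(pixels, 3))
# Latency encoding spreads the information across time.
print(latency_encode(pixels, 5))
```

Under this assumed scheme, the information loss with few time steps is easy to see: with `num_steps=2`, the intensities 0.6 and 0.9 both spike at step 0 and become indistinguishable, whereas a larger number of steps separates them.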
{"title":"Adaptive Spiking Neural Networks with Hybrid Coding","authors":"Huaxu He","doi":"arxiv-2408.12407","DOIUrl":"https://doi.org/arxiv-2408.12407","url":null,"abstract":"The Spiking Neural Network (SNN), due to its unique spiking-driven nature, is a more energy-efficient and effective neural network compared to Artificial Neural Networks (ANNs). The encoding method directly influences the overall performance of the network, and currently, direct encoding is primarily used for directly trained SNNs. When working with static image datasets, direct encoding inputs the same feature map at every time step, failing to fully exploit the spatiotemporal properties of SNNs. While temporal encoding converts input data into spike trains with spatiotemporal characteristics, traditional SNNs utilize the same neurons when processing input data across different time steps, limiting their ability to integrate and utilize spatiotemporal information effectively. To address this, this paper employs temporal encoding and proposes the Adaptive Spiking Neural Network (ASNN), enhancing the utilization of temporal encoding in conventional SNNs. Additionally, temporal encoding is less frequently used because short time steps can lead to significant loss of input data information, often necessitating a higher number of time steps in practical applications. However, training large SNNs with long time steps is challenging due to hardware constraints. To overcome this, this paper introduces a hybrid encoding approach that not only reduces the required time steps for training but also continues to improve the overall network performance. Notably, significant improvements in classification performance are observed on both Spikformer and Spiking ResNet architectures. Our code is available at https://github.com/hhx0320/ASNN","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}