arXiv - CS - Neural and Evolutionary Computing: Latest Articles

SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning
Pub Date : 2024-09-16 DOI: arxiv-2409.09990
Amogh Joshi, Adarsh Kumar Kosta, Kaushik Roy
The ability of neural networks to perform robotic perception and control tasks such as depth and optical flow estimation, simultaneous localization and mapping (SLAM), and automatic control has led to their widespread adoption in recent years. Deep Reinforcement Learning has been used extensively in these settings, as it does not have the unsustainable training costs associated with supervised learning. However, DeepRL suffers from poor sample efficiency, i.e., it requires a large number of environmental interactions to converge to an acceptable solution. Modern RL algorithms such as Deep Q Learning and Soft Actor-Critic attempt to remedy this shortcoming but cannot provide the explainability required in applications such as autonomous robotics. Humans intuitively understand the long-time-horizon sequential tasks common in robotics. Properly using such intuition can make RL policies more explainable while enhancing their sample efficiency. In this work, we propose SHIRE, a novel framework for encoding human intuition using Probabilistic Graphical Models (PGMs) and using it in the Deep RL training pipeline to enhance sample efficiency. Our framework achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost. Additionally, by teaching RL agents the encoded elementary behavior, SHIRE enhances policy explainability. A real-world demonstration further highlights the efficacy of policies trained using our framework.
Citations: 0
COSCO: A Sharpness-Aware Training Framework for Few-shot Multivariate Time Series Classification
Pub Date : 2024-09-15 DOI: arxiv-2409.09645
Jesus Barreda, Ashley Gomez, Ruben Puga, Kaixiong Zhou, Li Zhang
Multivariate time series classification is an important task with widespread domains of applications. Recently, deep neural networks (DNNs) have achieved state-of-the-art performance in time series classification. However, they often require large expert-labeled training datasets, which can be infeasible in practice. In few-shot settings, i.e., when only a limited number of samples per class are available in the training data, DNNs show a significant drop in testing accuracy and poor generalization ability. In this paper, we propose to address these problems from an optimization and a loss function perspective. Specifically, we propose a new learning framework named COSCO consisting of a sharpness-aware minimization (SAM) optimization and a Prototypical loss function to improve the generalization ability of DNNs for multivariate time series classification problems under the few-shot setting. Our experiments demonstrate that our proposed method outperforms the existing baseline methods. Our source code is available at: https://github.com/JRB9/COSCO.
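To make the "Prototypical loss" in the COSCO abstract concrete, here is a minimal pure-Python sketch of the usual prototypical-network formulation: each class is represented by the mean of its support embeddings, and a query is scored by a softmax over negative squared Euclidean distances to those prototypes. The abstract does not give the exact form COSCO uses, so the distance choice and the function names `class_prototypes` and `prototypical_loss` are assumptions made for illustration; the SAM optimizer is not shown.

```python
import math

def class_prototypes(embeddings, labels):
    """Return {label: mean embedding} computed from the support set."""
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric, not confirmed by the abstract)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototypical_loss(query, query_label, protos):
    """Negative log-probability of the true class under a softmax over -distance."""
    logits = {lab: -squared_distance(query, p) for lab, p in protos.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(v - top) for v in logits.values()))
    return log_z - logits[query_label]

# Two classes with two support embeddings each (toy 2-D embeddings).
support = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]]
support_labels = [0, 0, 1, 1]
protos = class_prototypes(support, support_labels)
loss_correct = prototypical_loss([1.0, 0.0], 0, protos)
loss_wrong = prototypical_loss([1.0, 0.0], 1, protos)
```

A query that sits on its own class prototype gets a loss close to zero, while labelling the same query with the distant class gives a large loss.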
Citations: 0
TX-Gen: Multi-Objective Optimization for Sparse Counterfactual Explanations for Time-Series Classification
Pub Date : 2024-09-14 DOI: arxiv-2409.09461
Qi Huang, Sofoklis Kitharidis, Thomas Bäck, Niki van Stein
In time-series classification, understanding model decisions is crucial for their application in high-stakes domains such as healthcare and finance. Counterfactual explanations, which provide insights by presenting alternative inputs that change model predictions, offer a promising solution. However, existing methods for generating counterfactual explanations for time-series data often struggle with balancing key objectives like proximity, sparsity, and validity. In this paper, we introduce TX-Gen, a novel algorithm for generating counterfactual explanations based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II). TX-Gen leverages evolutionary multi-objective optimization to find a diverse set of counterfactuals that are both sparse and valid, while maintaining minimal dissimilarity to the original time series. By incorporating a flexible reference-guided mechanism, our method improves the plausibility and interpretability of the counterfactuals without relying on predefined assumptions. Extensive experiments on benchmark datasets demonstrate that TX-Gen outperforms existing methods in generating high-quality counterfactuals, making time-series models more transparent and interpretable.
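The core idea behind the non-dominated sorting that NSGA-II relies on can be sketched in a few lines: a candidate is kept if no other candidate is at least as good on every objective and strictly better on one. The example below only extracts the first Pareto front; NSGA-II additionally ranks the later fronts and uses crowding distance, which are not shown. The two objectives (a proximity score and a count of changed time steps, both minimized) are hypothetical stand-ins, since the abstract does not specify how TX-Gen encodes its objectives.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated_front(points):
    """Return the candidates that no other candidate dominates (first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical counterfactual candidates: (distance to original, changed time steps).
candidates = [(0.2, 5), (0.5, 2), (0.6, 3), (0.2, 6)]
front = non_dominated_front(candidates)
```

Here `(0.6, 3)` is dominated by `(0.5, 2)` and `(0.2, 6)` by `(0.2, 5)`, so the front contains the two remaining trade-offs between closeness and sparsity.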
Citations: 0
ORS: A novel Olive Ridley Survival inspired Meta-heuristic Optimization Algorithm
Pub Date : 2024-09-13 DOI: arxiv-2409.09210
Niranjan Panigrahi, Sourav Kumar Bhoi, Debasis Mohapatra, Rashmi Ranjan Sahoo, Kshira Sagar Sahoo, Anil Mohapatra
Meta-heuristic algorithmic development has been a thrust area of research since its inception. In this paper, a novel meta-heuristic optimization algorithm, Olive Ridley Survival (ORS), is proposed, which is inspired by the survival challenges faced by hatchlings of the Olive Ridley sea turtle. A major fact about the survival of Olive Ridley turtles is that out of one thousand Olive Ridley hatchlings which emerge from the nest, only one survives at sea due to various environmental and other factors. This fact acts as the backbone for developing the proposed algorithm. The algorithm has two major phases: hatchlings' survival through environmental factors and the impact of movement trajectory on their survival. The phases are mathematically modelled and implemented along with suitable input representation and fitness function. The algorithm is analysed theoretically. To validate the algorithm, fourteen mathematical benchmark functions from standard CEC test suites are evaluated and statistically tested. Also, to study the efficacy of ORS on recent complex benchmark functions, ten benchmark functions of CEC-06-2019 are evaluated. Further, three well-known engineering problems are solved by ORS and compared with other state-of-the-art meta-heuristics. Simulation results show that in many cases, the proposed ORS algorithm outperforms some state-of-the-art meta-heuristic optimization algorithms. The sub-optimal behavior of ORS in some recent benchmark functions is also observed.
Citations: 0
Training Spiking Neural Networks via Augmented Direct Feedback Alignment
Pub Date : 2024-09-12 DOI: arxiv-2409.07776
Yongbo Zhang, Katsuma Inoue, Mitsumasa Nakajima, Toshikazu Hashimoto, Yasuo Kuniyoshi, Kohei Nakajima
Spiking neural networks (SNNs), the models inspired by the mechanisms of real neurons in the brain, transmit and represent information by employing discrete action potentials or spikes. The sparse, asynchronous properties of information processing make SNNs highly energy efficient, leading to SNNs being promising solutions for implementing neural networks in neuromorphic devices. However, the nondifferentiable nature of SNN neurons makes it a challenge to train them. The current training methods of SNNs that are based on error backpropagation (BP) and precisely designing surrogate gradient are difficult to implement and biologically implausible, hindering the implementation of SNNs on neuromorphic devices. Thus, it is important to train SNNs with a method that is both physically implementable and biologically plausible. In this paper, we propose using augmented direct feedback alignment (aDFA), a gradient-free approach based on random projection, to train SNNs. This method requires only partial information of the forward process during training, so it is easy to implement and biologically plausible. We systematically demonstrate the feasibility of the proposed aDFA-SNNs scheme, propose its effective working range, and analyze its well-performing settings by employing genetic algorithm. We also analyze the impact of crucial features of SNNs on the scheme, thus demonstrating its superiority and stability over BP and conventional direct feedback alignment. Our scheme can achieve competitive performance without accurate prior knowledge about the utilized system, thus providing a valuable reference for physically training SNNs.
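The random-projection idea can be illustrated with a sketch of a direct-feedback-alignment style update for one hidden layer. In direct feedback alignment, the output error is sent to the hidden layer through a fixed random matrix `B` rather than through the transposed forward weights. The sketch assumes that the "augmented" variant swaps the activation derivative for some other function `g` of the pre-activation; that reading, the uniform initialisation of `B`, and the names `make_feedback` and `dfa_hidden_update` are assumptions, not details taken from the abstract. No spiking dynamics are modelled here.

```python
import random

def make_feedback(n_hidden, n_out, seed=0):
    """Fixed random feedback matrix B of shape [n_hidden][n_out]; it is never trained."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_hidden)]

def dfa_hidden_update(x, a_hidden, error, B, g, lr=0.1):
    """Weight change for one hidden layer.

    x:        inputs to the layer
    a_hidden: pre-activations of the hidden units
    error:    output error vector (prediction minus target)
    B:        fixed random feedback matrix
    g:        function applied to each pre-activation (assumed stand-in for f')
    """
    # Project the output error straight to each hidden unit, then gate it by g.
    delta = [
        sum(B[j][k] * error[k] for k in range(len(error))) * g(a_hidden[j])
        for j in range(len(a_hidden))
    ]
    # Outer product of the projected error with the layer input.
    return [[-lr * delta[j] * x[i] for i in range(len(x))] for j in range(len(delta))]

# Tiny worked case with a hand-written B so the numbers are easy to follow.
B_small = [[0.5, 0.25]]
dW = dfa_hidden_update(
    x=[1.0, 2.0], a_hidden=[0.5], error=[1.0, -1.0], B=B_small, g=lambda a: 1.0
)
```

With these numbers the projected error is 0.5 - 0.25 = 0.25, so the update is approximately `[[-0.025, -0.05]]`. The forward weights never appear in the update, which is what makes the scheme attractive when the forward process is only partly known.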
Citations: 0
Classifying Images with CoLaNET Spiking Neural Network -- the MNIST Example
Pub Date : 2024-09-12 DOI: arxiv-2409.07833
Mikhail Kiselev
In the present paper, it is shown how the columnar/layered CoLaNET spiking neural network (SNN) architecture can be used in supervised learning image classification tasks. Image pixel brightness is coded by the spike count during image presentation period. Image class label is indicated by activity of special SNN input nodes (one node per class). The CoLaNET classification accuracy is evaluated on the MNIST benchmark. It is demonstrated that CoLaNET is almost as accurate as the most advanced machine learning algorithms (not using convolutional approach).
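The input coding described in the abstract (brightness as a spike count, class label as one active input node per class) can be sketched as follows. MNIST pixels are 8-bit values and there are ten digit classes; the maximum of 10 spikes per presentation and the linear brightness-to-count mapping are assumptions chosen for illustration, as the abstract does not state them.

```python
def spike_counts(pixels, max_spikes=10):
    """Map 8-bit brightness values (0..255) to spike counts in 0..max_spikes.

    The linear mapping and max_spikes are hypothetical choices.
    """
    return [round(p / 255 * max_spikes) for p in pixels]

def label_inputs(label, n_classes=10):
    """One label node per class; only the node of the true class is active."""
    return [1 if c == label else 0 for c in range(n_classes)]

# Four example pixels from black to white, and the label nodes for digit 3.
counts = spike_counts([0, 51, 128, 255])
nodes = label_inputs(3)
```

A black pixel emits no spikes during the presentation period and a white pixel emits the maximum, while the label nodes form a one-hot pattern that is only driven during supervised training.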
Citations: 0
Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding
Pub Date : 2024-09-11 DOI: arxiv-2409.07310
Ronald Katende
This paper explores the integration of Diophantine equations into neural network (NN) architectures to improve model interpretability, stability, and efficiency. By encoding and decoding neural network parameters as integer solutions to Diophantine equations, we introduce a novel approach that enhances both the precision and robustness of deep learning models. Our method integrates a custom loss function that enforces Diophantine constraints during training, leading to better generalization, reduced error bounds, and enhanced resilience against adversarial attacks. We demonstrate the efficacy of this approach through several tasks, including image classification and natural language processing, where improvements in accuracy, convergence, and robustness are observed. This study offers a new perspective on combining mathematical theory and machine learning to create more interpretable and efficient models.
Citations: 0
Y-Drop: A Conductance based Dropout for fully connected layers
Pub Date : 2024-09-11 DOI: arxiv-2409.09088
Efthymios Georgiou, Georgios Paraskevopoulos, Alexandros Potamianos
In this work, we introduce Y-Drop, a regularization method that biases the dropout algorithm towards dropping more important neurons with higher probability. The backbone of our approach is neuron conductance, an interpretable measure of neuron importance that calculates the contribution of each neuron towards the end-to-end mapping of the network. We investigate the impact of the uniform dropout selection criterion on performance by assigning higher dropout probability to the more important units. We show that forcing the network to solve the task at hand in the absence of its important units yields a strong regularization effect. Further analysis indicates that Y-Drop yields solutions where more neurons are important, i.e., have high conductance, and yields robust networks. In our experiments we show that the regularization effect of Y-Drop scales better than vanilla dropout w.r.t. the architecture size and consistently yields superior performance over multiple datasets and architecture combinations, with little tuning.
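The contrast with uniform dropout can be sketched by giving each unit its own drop probability that grows with an importance score. In the sketch below the importance scores are simply passed in; computing neuron conductance itself is not shown. The min-max mapping onto a `[p_low, p_high]` range, those default values, and the function names are assumptions for illustration, since the abstract does not say how Y-Drop converts conductance into probabilities.

```python
import random

def drop_probabilities(importance, p_low=0.1, p_high=0.5):
    """Map importance scores to per-unit drop probabilities (hypothetical min-max rule).

    The most important unit gets p_high, the least important gets p_low.
    """
    lo, hi = min(importance), max(importance)
    if hi == lo:
        # All units equally important: fall back to a single uniform rate.
        return [(p_low + p_high) / 2] * len(importance)
    return [p_low + (p_high - p_low) * (s - lo) / (hi - lo) for s in importance]

def apply_dropout(activations, probs, rng):
    """Inverted dropout with a separate drop probability for each unit."""
    out = []
    for a, p in zip(activations, probs):
        keep = rng.random() >= p
        # Rescale kept units so the expected activation is unchanged.
        out.append(a / (1.0 - p) if keep else 0.0)
    return out

probs = drop_probabilities([0.0, 1.0, 2.0])
dropped = apply_dropout([1.0, 1.0, 1.0], probs, random.Random(0))
```

With scores `[0.0, 1.0, 2.0]` the drop probabilities come out at roughly 0.1, 0.3 and 0.5, so the most important unit is removed most often, which is the opposite of protecting it.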
Citations: 0
Advanced LSTM Neural Networks for Predicting Directional Changes in Sector-Specific ETFs Using Machine Learning Techniques
Pub Date : 2024-09-09 DOI: arxiv-2409.05778
Rifa Gowani, Zaryab Kanjiani
Trading and investing in stocks for some is their full-time career, while for others, it's simply a supplementary income stream. Universal among all investors is the desire to turn a profit. The key to achieving this goal is diversification. Spreading investments across sectors is critical to profitability and maximizing returns. This study aims to gauge the viability of machine learning methods in practicing the principle of diversification to maximize portfolio returns. To test this, the study evaluates the Long Short-Term Memory (LSTM) model across nine different sectors and over 2,200 stocks using Vanguard's sector-based ETFs. The R-squared value across all sectors showed promising results, with an average of 0.8651 and a high of 0.942 for the VNQ ETF. These findings suggest that the LSTM model is a capable and viable model for accurately predicting directional changes across various industry sectors, helping investors diversify and grow their portfolios.
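For readers unfamiliar with the metric quoted in the abstract, R-squared is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the observed values. The sketch below computes it on toy numbers; the data are illustrative only and are not taken from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Toy values: a perfect fit and a model that always predicts the mean.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A perfect fit scores 1.0 and always predicting the mean scores 0.0. R-squared measures how closely predicted values track observed values, which is a different quantity from the fraction of up or down moves predicted correctly.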
Citations: 0
A Comprehensive Comparison Between ANNs and KANs For Classifying EEG Alzheimer's Data
Pub Date : 2024-09-09 DOI: arxiv-2409.05989
Akshay Sunkara, Sriram Sattiraju, Aakarshan Kumar, Zaryab Kanjiani, Himesh Anumala
Alzheimer's Disease is an incurable cognitive condition that affects thousands of people globally. While some diagnostic methods exist for Alzheimer's Disease, many of these methods cannot detect Alzheimer's in its earlier stages. Recently, researchers have explored the use of Electroencephalogram (EEG) technology for diagnosing Alzheimer's. EEG is a noninvasive method of recording the brain's electrical signals, and EEG data has shown distinct differences between patients with and without Alzheimer's. In the past, Artificial Neural Networks (ANNs) have been used to predict Alzheimer's from EEG data, but these models sometimes produce false positive diagnoses. This study aims to compare losses between ANNs and Kolmogorov-Arnold Networks (KANs) across multiple types of epochs, learning rates, and nodes. The results show that across these different parameters, ANNs are more accurate in predicting Alzheimer's Disease from EEG signals.
Citations: 0