
arXiv - CS - Neural and Evolutionary Computing: Latest Publications

Hardware-Friendly Implementation of Physical Reservoir Computing with CMOS-based Time-domain Analog Spiking Neurons
Pub Date: 2024-09-18 | arXiv: 2409.11612
Nanako Kimura, Ckristian Duran, Zolboo Byambadorj, Ryosho Nakane, Tetsuya Iizuka
This paper introduces an analog spiking neuron that utilizes time-domain information, i.e., a time interval of two signal transitions and a pulse width, to construct a spiking neural network (SNN) for a hardware-friendly physical reservoir computing (RC) on a complementary metal-oxide-semiconductor (CMOS) platform. A neuron with leaky integrate-and-fire is realized by employing two voltage-controlled oscillators (VCOs) with opposite sensitivities to the internal control voltage, and the neuron connection structure is restricted by the use of only 4 neighboring neurons on the 2-dimensional plane to feasibly construct a regular network topology. Such a system enables us to compose an SNN with a counter-based readout circuit, which simplifies the hardware implementation of the SNN. Moreover, another technical advantage thanks to the bottom-up integration is the capability of dynamically capturing every neuron state in the network, which can significantly contribute to finding guidelines on how to enhance the performance for various computational tasks in temporal information processing. Diverse nonlinear physical dynamics needed for RC can be realized by collective behavior through dynamic interaction between neurons, like coupled oscillators, despite the simple network structure. With behavioral system-level simulations, we demonstrate physical RC through short-term memory and exclusive OR tasks, and the spoken digit recognition task with an accuracy of 97.7% as well. Our system is considerably feasible for practical applications and also can be a useful platform for studying the mechanism of physical RC.
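
As a rough illustration of the neuron model behind this design, the sketch below simulates a plain discrete-time leaky integrate-and-fire neuron and tallies its output spikes, loosely mirroring the counter-based readout described in the abstract. It is a behavioral toy only: the VCO-based time-domain encoding and the CMOS circuit itself are not modeled, and the leak, threshold, and drive values are arbitrary assumptions.

```python
import numpy as np

def lif_spike_count(input_current, leak=0.95, threshold=1.0, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron with a spike counter."""
    v = 0.0
    spikes = np.zeros_like(input_current)
    for t, i_t in enumerate(input_current):
        v = leak * v + i_t          # leaky integration of the input drive
        if v >= threshold:          # fire and reset once the threshold is crossed
            spikes[t] = 1.0
            v = v_reset
    return spikes, int(spikes.sum())

# A noisy constant drive produces a roughly regular spike train whose count
# serves as a simple counter-based readout of the neuron's activity.
rng = np.random.default_rng(0)
drive = 0.2 + 0.05 * rng.standard_normal(200)
spike_train, count = lif_spike_count(drive)
print(f"{count} spikes over 200 steps")
```
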
PReLU: Yet Another Single-Layer Solution to the XOR Problem
Pub Date: 2024-09-17 | arXiv: 2409.10821
Rafael C. Pinto, Anderson R. Tavares
This paper demonstrates that a single-layer neural network using Parametric Rectified Linear Unit (PReLU) activation can solve the XOR problem, a simple fact that has been overlooked so far. We compare this solution to the multi-layer perceptron (MLP) and the Growing Cosine Unit (GCU) activation function and explain why PReLU enables this capability. Our results show that the single-layer PReLU network can achieve 100% success rate in a wider range of learning rates while using only three learnable parameters.
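
To make the claim concrete, here is one hand-constructed single-unit configuration that reproduces XOR exactly with a PReLU activation and three free parameters (two input weights and the negative-side slope, with the bias fixed at zero). It is an illustration consistent with the abstract, not necessarily the parameterization learned in the paper.

```python
import numpy as np

def prelu(z, a):
    """Parametric ReLU: identity for positive inputs, slope a for non-positive ones."""
    return np.where(z > 0, z, a * z)

# Three parameters: the two input weights and the PReLU negative-side slope.
# (The bias is fixed at zero in this hand-constructed solution.)
w = np.array([1.0, -1.0])
a = -1.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0, 1, 1, 0], dtype=float)

outputs = prelu(X @ w, a)
print(outputs)                        # [0. 1. 1. 0.]
assert np.allclose(outputs, y_xor)    # the single unit reproduces XOR exactly
```

Because the negative-side slope is itself negative, the unit's response is V-shaped in the projection w·x, which is what lets a single layer carve out the non-linearly-separable XOR pattern.
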
Inferno: An Extensible Framework for Spiking Neural Networks
Pub Date: 2024-09-17 | arXiv: 2409.11567
Marissa Dominijanni
This paper introduces Inferno, a software library built on top of PyTorch that is designed to meet distinctive challenges of using spiking neural networks (SNNs) for machine learning tasks. We describe the architecture of Inferno and key differentiators that make it uniquely well-suited to these tasks. We show how Inferno supports trainable heterogeneous delays on both CPUs and GPUs, and how Inferno enables a "write once, apply everywhere" development methodology for novel models and techniques. We compare Inferno's performance to BindsNET, a library aimed at machine learning with SNNs, and Brian2/Brian2CUDA, which is popular in neuroscience. Among several examples, we show how the design decisions made by Inferno facilitate easily implementing the new methods of Nadafian and Ganjtabesh in delay learning with spike-timing-dependent plasticity.
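
Since the library is built around plasticity rules such as spike-timing-dependent plasticity (STDP), the following library-agnostic sketch shows the classic pair-based, trace-driven STDP update for a single synapse. It is not Inferno's API and omits the trainable-delay machinery; all constants are conventional textbook values chosen here for illustration.

```python
import numpy as np

def stdp_pair_update(pre_spikes, post_spikes, w, a_plus=0.01, a_minus=0.012,
                     tau_plus=20.0, tau_minus=20.0, dt=1.0, w_min=0.0, w_max=1.0):
    """Pair-based STDP for one synapse driven by binary spike trains (0/1 per step).

    Pre-before-post pairings potentiate the weight in proportion to the
    presynaptic trace; post-before-pre pairings depress it in proportion to
    the postsynaptic trace.
    """
    x_pre = x_post = 0.0
    for pre, post in zip(pre_spikes, post_spikes):
        x_pre = x_pre * np.exp(-dt / tau_plus) + pre      # presynaptic trace
        x_post = x_post * np.exp(-dt / tau_minus) + post  # postsynaptic trace
        if post:
            w += a_plus * x_pre
        if pre:
            w -= a_minus * x_post
        w = float(np.clip(w, w_min, w_max))
    return w

# Correlated trains where the postsynaptic neuron tends to fire one step after
# the presynaptic one should, on average, potentiate the synapse.
rng = np.random.default_rng(1)
pre = (rng.random(500) < 0.05).astype(float)
post = np.roll(pre, 1)
print(stdp_pair_update(pre, post, w=0.5))
```
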
MonoKAN: Certified Monotonic Kolmogorov-Arnold Network
Pub Date: 2024-09-17 | arXiv: 2409.11078
Alejandro Polo-Molina, David Alfaya, Jose Portela
Artificial Neural Networks (ANNs) have significantly advanced various fields by effectively recognizing patterns and solving complex problems. Despite these advancements, their interpretability remains a critical challenge, especially in applications where transparency and accountability are essential. To address this, explainable AI (XAI) has made progress in demystifying ANNs, yet interpretability alone is often insufficient. In certain applications, model predictions must align with expert-imposed requirements, sometimes exemplified by partial monotonicity constraints. While monotonic approaches are found in the literature for traditional Multi-layer Perceptrons (MLPs), they still face difficulties in achieving both interpretability and certified partial monotonicity. Recently, the Kolmogorov-Arnold Network (KAN) architecture, based on learnable activation functions parametrized as splines, has been proposed as a more interpretable alternative to MLPs. Building on this, we introduce a novel ANN architecture called MonoKAN, which is based on the KAN architecture and achieves certified partial monotonicity while enhancing interpretability. To achieve this, we employ cubic Hermite splines, which guarantee monotonicity through a set of straightforward conditions. Additionally, by using positive weights in the linear combinations of these splines, we ensure that the network preserves the monotonic relationships between input and output. Our experiments demonstrate that MonoKAN not only enhances interpretability but also improves predictive performance across the majority of benchmarks, outperforming state-of-the-art monotonic MLP approaches.
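
For intuition on the monotonicity mechanism, the sketch below evaluates a cubic Hermite spline whose node values and slopes have been projected onto a configuration satisfying a classical sufficient condition for segment-wise monotonicity (non-negative slopes bounded by three times the neighboring secant slopes, in the spirit of Fritsch and Carlson). The exact conditions and parameterization used in MonoKAN may differ; this is an assumed, simplified version.

```python
import numpy as np

def hermite_eval(x, xs, ys, ms):
    """Evaluate a cubic Hermite spline with knots xs, node values ys, slopes ms."""
    i = np.clip(np.searchsorted(xs, x) - 1, 0, len(xs) - 2)
    h = xs[i + 1] - xs[i]
    t = (x - xs[i]) / h
    h00, h10 = 2 * t**3 - 3 * t**2 + 1, t**3 - 2 * t**2 + t
    h01, h11 = -2 * t**3 + 3 * t**2, t**3 - t**2
    return h00 * ys[i] + h10 * h * ms[i] + h01 * ys[i + 1] + h11 * h * ms[i + 1]

def project_monotone(raw_increments, raw_slopes, xs):
    """Map free parameters to node values/slopes of a non-decreasing spline."""
    ys = np.cumsum(np.abs(raw_increments))           # non-decreasing node values
    deltas = np.diff(ys) / np.diff(xs)               # secant slope of each segment
    bound = 3 * np.minimum(np.insert(deltas, 0, deltas[0]),
                           np.append(deltas, deltas[-1]))
    ms = np.clip(raw_slopes, 0.0, bound)             # 0 <= m_i <= 3*min(adjacent deltas)
    return ys, ms

xs = np.linspace(-1.0, 1.0, 6)
rng = np.random.default_rng(0)
ys, ms = project_monotone(rng.standard_normal(6), rng.standard_normal(6), xs)
grid = np.linspace(-1.0, 1.0, 400)
vals = hermite_eval(grid, xs, ys, ms)
assert np.all(np.diff(vals) >= -1e-9)                # the resulting activation is monotone
```
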
Bio-Inspired Mamba: Temporal Locality and Bioplausible Learning in Selective State Space Models
Pub Date: 2024-09-17 | arXiv: 2409.11263
Jiahao Qin
This paper introduces Bio-Inspired Mamba (BIM), a novel online learning framework for selective state space models that integrates biological learning principles with the Mamba architecture. BIM combines Real-Time Recurrent Learning (RTRL) with Spike-Timing-Dependent Plasticity (STDP)-like local learning rules, addressing the challenges of temporal locality and biological plausibility in training spiking neural networks. Our approach leverages the inherent connection between backpropagation through time and STDP, offering a computationally efficient alternative that maintains the ability to capture long-range dependencies. We evaluate BIM on language modeling, speech recognition, and biomedical signal analysis tasks, demonstrating competitive performance against traditional methods while adhering to biological learning principles. Results show improved energy efficiency and potential for neuromorphic hardware implementation. BIM not only advances the field of biologically plausible machine learning but also provides insights into the mechanisms of temporal information processing in biological neural networks.
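
To illustrate why temporal locality is attainable in this setting, the sketch below runs exact Real-Time Recurrent Learning (RTRL) for a toy diagonal linear state-space model, where the sensitivities needed for online updates stay element-wise and cheap. It is only a didactic stand-in: it is not the BIM rule, the recurrence has no selectivity (the decay is input-independent), and the STDP-like component is omitted.

```python
import numpy as np

def rtrl_diagonal_ssm(xs, ys, d_state=8, lr=1e-2, seed=0):
    """Online RTRL for a diagonal linear state-space model
        h_t = a * h_{t-1} + b * x_t,   y_hat_t = w . h_t
    With a diagonal recurrence, the sensitivities dh/da and dh/db are single
    vectors carried forward in time, so every parameter can be updated at each
    step without backpropagation through time.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.5, 0.9, d_state)        # per-dimension recurrent decay
    b = rng.standard_normal(d_state) * 0.1    # input weights
    w = rng.standard_normal(d_state) * 0.1    # readout weights
    h = np.zeros(d_state)
    dh_da = np.zeros(d_state)                 # sensitivity of h w.r.t. a
    dh_db = np.zeros(d_state)                 # sensitivity of h w.r.t. b

    for x_t, y_t in zip(xs, ys):
        dh_da = a * dh_da + h                 # RTRL recursion for d h_t / d a
        dh_db = a * dh_db + x_t               # RTRL recursion for d h_t / d b
        h = a * h + b * x_t
        err = w @ h - y_t
        w -= lr * err * h                     # local gradient step at every time step
        a -= lr * err * (w * dh_da)
        b -= lr * err * (w * dh_db)
        a = np.clip(a, 0.0, 0.999)            # keep the recurrence stable
    return a, b, w

# Toy usage: train online on a one-step-delay target y_t = x_{t-1}.
rng = np.random.default_rng(0)
xs = rng.standard_normal(5000)
ys = np.concatenate([[0.0], xs[:-1]])
a, b, w = rtrl_diagonal_ssm(xs, ys)
```
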
Self-Contrastive Forward-Forward Algorithm
Pub Date: 2024-09-17 | arXiv: 2409.11593
Xing Chen, Dongshu Liu, Jeremie Laydevant, Julie Grollier
The Forward-Forward (FF) algorithm is a recent, purely forward-mode learning method that updates weights locally and layer-wise and supports supervised as well as unsupervised learning. These features make it ideal for applications such as brain-inspired learning, low-power hardware neural networks, and distributed learning in large models. However, while FF has shown promise on written digit recognition tasks, its performance on natural images and time-series remains a challenge. A key limitation is the need to generate high-quality negative examples for contrastive learning, especially in unsupervised tasks, where versatile solutions are currently lacking. To address this, we introduce the Self-Contrastive Forward-Forward (SCFF) method, inspired by self-supervised contrastive learning. SCFF generates positive and negative examples applicable across different datasets, surpassing existing local forward algorithms for unsupervised classification accuracy on MNIST (MLP: 98.7%), CIFAR-10 (CNN: 80.75%), and STL-10 (CNN: 77.3%). Additionally, SCFF is the first to enable FF training of recurrent neural networks, opening the door to more complex tasks and continuous-time video and text processing.
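
For readers unfamiliar with the base algorithm, here is a minimal PyTorch sketch of one layer trained with the generic Forward-Forward objective: the layer's "goodness" (sum of squared activations) is pushed above a threshold for positive examples and below it for negative ones, using only local gradients. How SCFF actually constructs its positive and negative examples is the paper's contribution and is not reproduced here; the threshold and learning rate are arbitrary choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    """One layer trained with the layer-local Forward-Forward objective."""
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # normalize so only the direction of the input carries information forward
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)   # goodness of positive samples
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)   # goodness of negative samples
        # logistic loss pushing positives above and negatives below the threshold
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # pass detached activations to the next layer: local updates, no end-to-end backprop
        return self.forward(x_pos).detach(), self.forward(x_neg).detach(), loss.item()

layer = FFLayer(784, 500)
x_pos, x_neg = torch.rand(32, 784), torch.rand(32, 784)
h_pos, h_neg, loss = layer.train_step(x_pos, x_neg)
```
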
Evaluating the Efficacy of Instance Incremental vs. Batch Learning in Delayed Label Environments: An Empirical Study on Tabular Data Streaming for Fraud Detection
Pub Date: 2024-09-16 | arXiv: 2409.10111
Kodjo Mawuena Amekoe, Mustapha Lebbah, Gregoire Jaffre, Hanene Azzag, Zaineb Chelly Dagdia
Real-world tabular learning production scenarios typically involve evolving data streams, where data arrives continuously and its distribution may change over time. In such a setting, most studies in the literature regarding supervised learning favor the use of instance incremental algorithms due to their ability to adapt to changes in the data distribution. Another significant reason for choosing these algorithms is to avoid storing observations in memory, as commonly done in batch incremental settings. However, the design of instance incremental algorithms often assumes immediate availability of labels, which is an optimistic assumption. In many real-world scenarios, such as fraud detection or credit scoring, labels may be delayed. Consequently, batch incremental algorithms are widely used in many real-world tasks. This raises an important question: "In delayed settings, is instance incremental learning the best option regarding predictive performance and computational efficiency?" Unfortunately, this question has not been studied in depth, probably due to the scarcity of real datasets containing delayed information. In this study, we conduct a comprehensive empirical evaluation and analysis of this question using a real-world fraud detection problem and commonly used generated datasets. Our findings indicate that instance incremental learning is not the superior option, considering on one side state-of-the-art models such as Adaptive Random Forest (ARF) and on the other side batch learning models such as XGBoost. Additionally, when considering the interpretability of the learning systems, batch incremental solutions tend to be favored. Code: https://github.com/anselmeamekoe/DelayedLabelStream
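
The evaluation protocol at stake can be made concrete with a small test-then-train (prequential) loop in which every label arrives a fixed number of steps late. The sketch below uses scikit-learn's SGDClassifier through partial_fit purely as a stand-in instance-incremental learner; the paper's models (ARF, XGBoost), datasets, and delay mechanism are not reproduced, and the fixed delay is an assumption. X and y are assumed to be NumPy arrays ordered by arrival time.

```python
from collections import deque
import numpy as np
from sklearn.linear_model import SGDClassifier

def prequential_delayed(X, y, delay=1000, classes=(0, 1)):
    """Test-then-train loop in which each label becomes available `delay` steps late."""
    model = SGDClassifier()
    pending = deque()                      # indices still waiting for their label
    fitted = False
    correct = scored = 0

    for i in range(len(X)):
        x_i = X[i].reshape(1, -1)
        if fitted:                         # predict before the true label is known
            correct += int(model.predict(x_i)[0] == y[i])
            scored += 1
        pending.append(i)
        if len(pending) > delay:           # an old observation's label finally arrives
            j = pending.popleft()
            if not fitted:
                model.partial_fit(X[j].reshape(1, -1), y[j:j + 1],
                                  classes=np.asarray(classes))
                fitted = True
            else:
                model.partial_fit(X[j].reshape(1, -1), y[j:j + 1])
    return correct / max(scored, 1)
```
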
Fixed-Parameter Tractability of the (1+1) Evolutionary Algorithm on Random Planted Vertex Covers
Pub Date: 2024-09-16 | arXiv: 2409.10144
Jack Kearney, Frank Neumann, Andrew M. Sutton
We present the first parameterized analysis of a standard (1+1) Evolutionary Algorithm on a distribution of vertex cover problems. We show that if the planted cover is at most logarithmic, restarting the (1+1) EA every $O(n \log n)$ steps will find a cover at least as small as the planted cover in polynomial time for sufficiently dense random graphs $p > 0.71$. For superlogarithmic planted covers, we prove that the (1+1) EA finds a solution in fixed-parameter tractable time in expectation. We complement these theoretical investigations with a number of computational experiments that highlight the interplay between planted cover size, graph density and runtime.
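
For reference, this is roughly what the analyzed algorithm looks like when written out: a (1+1) EA with standard bit-flip mutation on the characteristic vector of a cover, the usual lexicographic fitness (uncovered edges first, cover size second), and a restart every c·n·log n iterations as in the abstract. The random planted-cover graph model itself is not reproduced; any edge list can be passed in, and the restart constant and budget are arbitrary choices.

```python
import math
import random

def fitness(bits, edges):
    """(#uncovered edges, cover size); minimized lexicographically."""
    uncovered = sum(1 for u, v in edges if not bits[u] and not bits[v])
    return (uncovered, sum(bits))

def one_plus_one_ea_vertex_cover(n, edges, restart_factor=2.0, max_restarts=50):
    """(1+1) EA with standard bit-flip mutation and periodic restarts."""
    budget_per_run = int(restart_factor * n * math.log(n))
    best_bits, best_fit = None, None
    for _ in range(max_restarts):
        bits = [random.random() < 0.5 for _ in range(n)]
        fit = fitness(bits, edges)
        for _ in range(budget_per_run):
            # flip each bit independently with probability 1/n
            child = [b != (random.random() < 1.0 / n) for b in bits]
            child_fit = fitness(child, edges)
            if child_fit <= fit:              # accept the offspring if it is no worse
                bits, fit = child, child_fit
        if best_fit is None or fit < best_fit:
            best_bits, best_fit = bits, fit
    return best_bits, best_fit

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]      # a 4-cycle; a minimum cover has size 2
cover, (uncovered, size) = one_plus_one_ea_vertex_cover(4, edges)
print(uncovered, size)
```
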
Kolmogorov-Arnold Transformer
Pub Date: 2024-09-16 | arXiv: 2409.10594
Xingyi Yang, Xinchao Wang
Transformers stand as the cornerstone of modern deep learning. Traditionally, these models rely on multi-layer perceptron (MLP) layers to mix the information between channels. In this paper, we introduce the Kolmogorov-Arnold Transformer (KAT), a novel architecture that replaces MLP layers with Kolmogorov-Arnold Network (KAN) layers to enhance the expressiveness and performance of the model. Integrating KANs into transformers, however, is no easy feat, especially when scaled up. Specifically, we identify three key challenges: (C1) Base function. The standard B-spline function used in KANs is not optimized for parallel computing on modern hardware, resulting in slower inference speeds. (C2) Parameter and computation inefficiency. KAN requires a unique function for each input-output pair, making the computation extremely large. (C3) Weight initialization. The initialization of weights in KANs is particularly challenging due to their learnable activation functions, which are critical for achieving convergence in deep neural networks. To overcome the aforementioned challenges, we propose three key solutions: (S1) Rational basis. We replace B-spline functions with rational functions to improve compatibility with modern GPUs. By implementing this in CUDA, we achieve faster computations. (S2) Group KAN. We share the activation weights through a group of neurons, to reduce the computational load without sacrificing performance. (S3) Variance-preserving initialization. We carefully initialize the activation weights to make sure that the activation variance is maintained across layers. With these designs, KAT scales effectively and readily outperforms traditional MLP-based transformers.
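
A rough sketch of the (S1)/(S2) ideas: a learnable rational (Padé-style) activation whose coefficients are shared across groups of channels, so a layer does not need one function per input-output pair. This is a plain PyTorch reference, not the paper's CUDA kernel; the polynomial degrees, the initialization, and the positivity trick in the denominator are all assumptions made here for illustration.

```python
import torch
import torch.nn as nn

class GroupedRationalActivation(nn.Module):
    """Learnable rational activation y = P(x) / Q(x), shared within channel groups.

    The absolute values in the denominator keep it at least 1, avoiding poles.
    """
    def __init__(self, num_channels, num_groups=8, p=5, q=4):
        super().__init__()
        assert num_channels % num_groups == 0
        self.num_groups = num_groups
        self.a = nn.Parameter(torch.randn(num_groups, p + 1) * 0.1)  # numerator coefficients
        self.b = nn.Parameter(torch.randn(num_groups, q) * 0.1)      # denominator coefficients

    def forward(self, x):
        # x: (batch, channels); channels are split into groups sharing one activation
        B, C = x.shape
        xg = x.view(B, self.num_groups, C // self.num_groups)
        powers = torch.stack([xg ** k for k in range(self.a.shape[1])], dim=-1)
        num = (powers * self.a[None, :, None, :]).sum(-1)
        powers_q = torch.stack([xg ** (k + 1) for k in range(self.b.shape[1])], dim=-1)
        den = 1.0 + (powers_q * self.b[None, :, None, :]).abs().sum(-1)
        return (num / den).view(B, C)

act = GroupedRationalActivation(num_channels=64)
y = act(torch.randn(8, 64))
```
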
Steinmetz Neural Networks for Complex-Valued Data
Pub Date: 2024-09-16 | arXiv: 2409.10075
Shyam Venkatasubramanian, Ali Pezeshki, Vahid Tarokh
In this work, we introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. Our proposed class of architectures, referred to as Steinmetz Neural Networks, leverages multi-view learning to construct more interpretable representations within the latent space. Subsequently, we present the Analytic Neural Network, which implements a consistency penalty that encourages analytic signal representations in the Steinmetz neural network's latent space. This penalty enforces a deterministic and orthogonal relationship between the real and imaginary components. Utilizing an information-theoretic construction, we demonstrate that the upper bound on the generalization error posited by the analytic neural network is lower than that of the general class of Steinmetz neural networks. Our numerical experiments demonstrate the improved performance and robustness to additive noise afforded by our proposed networks on benchmark datasets and synthetic examples.
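
The core architectural idea, two real-valued subnetworks whose outputs are coupled, can be sketched as below. The "consistency penalty" shown here is only a placeholder orthogonality term between the two latent codes, loosely motivated by the abstract's wording; the paper's actual analytic-signal penalty and its information-theoretic analysis are not reproduced, and all layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class SteinmetzStyleNet(nn.Module):
    """Parallel real-valued subnetworks for the real and imaginary parts of a
    complex input, coupled through a shared readout head. Sketch only."""
    def __init__(self, d_in, d_hidden, d_out):
        super().__init__()
        def branch():
            return nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(),
                                 nn.Linear(d_hidden, d_hidden))
        self.net_re, self.net_im = branch(), branch()
        self.head = nn.Linear(2 * d_hidden, d_out)

    def forward(self, z):
        # z: complex-valued tensor of shape (batch, d_in)
        h_re, h_im = self.net_re(z.real), self.net_im(z.imag)
        out = self.head(torch.cat([h_re, h_im], dim=1))
        # Placeholder coupling penalty encouraging orthogonal latent codes;
        # a stand-in (assumption) for the paper's analytic-signal consistency term.
        penalty = (h_re * h_im).sum(dim=1).pow(2).mean()
        return out, penalty

# Usage sketch: add the penalty to the task loss with some weight.
model = SteinmetzStyleNet(d_in=16, d_hidden=64, d_out=10)
z = torch.randn(8, 16, dtype=torch.cfloat)
logits, penalty = model(z)
```
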