{"title":"DTGA: an in-situ training scheme for memristor neural networks with high performance","authors":"Siyuan Shen, Mingjian Guo, Lidan Wang, Shukai Duan","doi":"10.1007/s10489-024-06091-9","DOIUrl":null,"url":null,"abstract":"<div><p>Memristor Neural Networks (MNNs) stand out for their low power consumption and accelerated matrix operations, making them a promising hardware solution for neural network implementations. The efficacy of MNNs is significantly influenced by the careful selection of memristor update thresholds and the in-situ update scheme during hardware deployment. This paper addresses these critical aspects through the introduction of a novel scheme that integrates Dynamic Threshold (DT) and Gradient Accumulation (GA) with Threshold Properties. In this paper, realistic memristor characteristics, including pulse-to-pulse (P2P) and device-to-device (D2D) behaviors, were simulated by introducing random noise to the Vteam memristor model. A dynamic threshold scheme is proposed to enhance in-situ training accuracy, leveraging the inherent characteristics of memristors. Furthermore, the accumulation of gradients during back propagation is employed to finely regulate memristor updates, contributing to an improved in-situ training accuracy. Experimental results demonstrate a significant enhancement in test accuracy using the DTGA scheme on the MNIST dataset (82.98% to 96.15%) and the Fashion-MNIST dataset (75.58% to 82.53%). Robustness analysis reveals the DTGA scheme’s ability to tolerate a random noise factor of 0.03 for the MNIST dataset and 0.02 for the Fashion-MNIST dataset, showcasing its reliability under varied conditions. Notably, in the Fashion-MNIST dataset, the DTGA scheme yields a 7% performance improvement accompanied by a corresponding 7% reduction in training time. This study affirms the efficiency and accuracy of the DTGA scheme, which proves adaptable beyond multilayer perceptron neural networks (MLP), offering a compelling solution for the hardware implementation of diverse neuromorphic systems.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 2","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2024-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-024-06091-9","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Memristor Neural Networks (MNNs) stand out for their low power consumption and accelerated matrix operations, making them a promising hardware platform for neural network implementations. Their efficacy depends strongly on the choice of memristor update thresholds and on the in-situ update scheme used during hardware deployment. This paper addresses both aspects with a novel scheme that integrates a Dynamic Threshold (DT) with Gradient Accumulation (GA) governed by the devices' threshold properties. Realistic memristor characteristics, including pulse-to-pulse (P2P) and device-to-device (D2D) variation, are simulated by injecting random noise into the VTEAM memristor model. A dynamic threshold scheme that leverages the inherent characteristics of memristors is proposed to improve in-situ training accuracy. In addition, gradients accumulated during backpropagation are used to finely regulate memristor updates, further improving in-situ training accuracy. Experimental results show that the DTGA scheme substantially raises test accuracy on the MNIST dataset (from 82.98% to 96.15%) and the Fashion-MNIST dataset (from 75.58% to 82.53%). Robustness analysis shows that the DTGA scheme tolerates a random noise factor of 0.03 on MNIST and 0.02 on Fashion-MNIST, demonstrating its reliability under varied conditions. Notably, on Fashion-MNIST the DTGA scheme yields a 7% performance improvement together with a 7% reduction in training time. This study confirms the efficiency and accuracy of the DTGA scheme, which generalizes beyond multilayer perceptrons (MLPs) and offers a compelling solution for the hardware implementation of diverse neuromorphic systems.
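As a rough illustration of the mechanism the abstract describes, the Python sketch below shows one way a threshold-gated, gradient-accumulating update with pulse noise could be structured: gradients are accumulated across backpropagation steps, a weight receives a write pulse only once its accumulated gradient crosses a threshold that changes over training, and each pulse is perturbed by multiplicative noise standing in for P2P variation. The function names, the threshold decay rule, the noise model, and all parameter values are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_pulse(delta_w, noise_factor=0.03):
    """Apply one conductance-update pulse with multiplicative P2P noise (assumed model)."""
    return delta_w * (1.0 + noise_factor * rng.standard_normal(np.shape(delta_w)))

def dtga_step(weights, grads, grad_acc, threshold, lr=0.1, decay=0.99):
    """One DTGA-style step (hypothetical): accumulate gradients, pulse only past the threshold."""
    grad_acc += grads                                    # gradient accumulation (GA)
    mask = np.abs(grad_acc) >= threshold                 # only large enough updates fire a pulse
    weights[mask] += noisy_pulse(-lr * grad_acc[mask])   # write pulses see device noise
    grad_acc[mask] = 0.0                                 # reset accumulator where a pulse fired
    return weights, grad_acc, threshold * decay          # dynamic threshold (DT), assumed decay rule

# Toy usage: ten steps on a four-element weight vector.
w, acc, th = rng.standard_normal(4), np.zeros(4), 0.05
for _ in range(10):
    g = 0.02 * rng.standard_normal(4)                    # stand-in for backprop gradients
    w, acc, th = dtga_step(w, g, acc, th)
```

The gating-plus-accumulation structure is the point of the sketch: small gradients are not lost to the device's update threshold but are banked until they are large enough to program reliably, which is consistent with the accuracy gains the abstract reports.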
Journal description:
With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government, and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, the application of knowledge, and distributed and parallel processing. The integration of these multiple approaches to solving complex problems is of particular importance.
The journal presents new and original research and technological developments addressing real, complex problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.