{"title":"基于闪存的超高耐久性神经网络在线训练的优化运行方案","authors":"Yang Feng, Zhaohui Sun, Yueran Qi, Xuepeng Zhan, Junyu Zhang, Jing Liu, Masaharu Kobayashi, Jixuan Wu, Jiezhi Chen","doi":"10.1088/1674-4926/45/1/012301","DOIUrl":null,"url":null,"abstract":"With the rapid development of machine learning, the demand for high-efficient computing becomes more and more urgent. To break the bottleneck of the traditional Von Neumann architecture, computing-in-memory (CIM) has attracted increasing attention in recent years. In this work, to provide a feasible CIM solution for the large-scale neural networks (NN) requiring continuous weight updating in online training, a flash-based computing-in-memory with high endurance (10<sup>9</sup> cycles) and ultra-fast programming speed is investigated. On the one hand, the proposed programming scheme of channel hot electron injection (CHEI) and hot hole injection (HHI) demonstrate high linearity, symmetric potentiation, and a depression process, which help to improve the training speed and accuracy. On the other hand, the low-damage programming scheme and memory window (MW) optimizations can suppress cell degradation effectively with improved computing accuracy. Even after 10<sup>9</sup> cycles, the leakage current (<italic toggle=\"yes\">I</italic>\n<sub>off</sub>) of cells remains sub-10pA, ensuring the large-scale computing ability of memory. Further characterizations are done on read disturb to demonstrate its robust reliabilities. By processing CIFAR-10 tasks, it is evident that ~90% accuracy can be achieved after 10<sup>9</sup> cycles in both ResNet50 and VGG16 NN. Our results suggest that flash-based CIM has great potential to overcome the limitations of traditional Von Neumann architectures and enable high-performance NN online training, which pave the way for further development of artificial intelligence (AI) accelerators.","PeriodicalId":17038,"journal":{"name":"Journal of Semiconductors","volume":"70 1","pages":""},"PeriodicalIF":4.8000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimized operation scheme of flash-memory-based neural network online training with ultra-high endurance\",\"authors\":\"Yang Feng, Zhaohui Sun, Yueran Qi, Xuepeng Zhan, Junyu Zhang, Jing Liu, Masaharu Kobayashi, Jixuan Wu, Jiezhi Chen\",\"doi\":\"10.1088/1674-4926/45/1/012301\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rapid development of machine learning, the demand for high-efficient computing becomes more and more urgent. To break the bottleneck of the traditional Von Neumann architecture, computing-in-memory (CIM) has attracted increasing attention in recent years. In this work, to provide a feasible CIM solution for the large-scale neural networks (NN) requiring continuous weight updating in online training, a flash-based computing-in-memory with high endurance (10<sup>9</sup> cycles) and ultra-fast programming speed is investigated. On the one hand, the proposed programming scheme of channel hot electron injection (CHEI) and hot hole injection (HHI) demonstrate high linearity, symmetric potentiation, and a depression process, which help to improve the training speed and accuracy. On the other hand, the low-damage programming scheme and memory window (MW) optimizations can suppress cell degradation effectively with improved computing accuracy. 
Even after 10<sup>9</sup> cycles, the leakage current (<italic toggle=\\\"yes\\\">I</italic>\\n<sub>off</sub>) of cells remains sub-10pA, ensuring the large-scale computing ability of memory. Further characterizations are done on read disturb to demonstrate its robust reliabilities. By processing CIFAR-10 tasks, it is evident that ~90% accuracy can be achieved after 10<sup>9</sup> cycles in both ResNet50 and VGG16 NN. Our results suggest that flash-based CIM has great potential to overcome the limitations of traditional Von Neumann architectures and enable high-performance NN online training, which pave the way for further development of artificial intelligence (AI) accelerators.\",\"PeriodicalId\":17038,\"journal\":{\"name\":\"Journal of Semiconductors\",\"volume\":\"70 1\",\"pages\":\"\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2024-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Semiconductors\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1088/1674-4926/45/1/012301\",\"RegionNum\":4,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, CONDENSED MATTER\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Semiconductors","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/1674-4926/45/1/012301","RegionNum":4,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, CONDENSED MATTER","Score":null,"Total":0}
Optimized operation scheme of flash-memory-based neural network online training with ultra-high endurance
With the rapid development of machine learning, the demand for highly efficient computing has become increasingly urgent. To break the bottleneck of the traditional von Neumann architecture, computing-in-memory (CIM) has attracted growing attention in recent years. In this work, to provide a feasible CIM solution for large-scale neural networks (NNs) that require continuous weight updating during online training, a flash-based CIM scheme with high endurance (10⁹ cycles) and ultra-fast programming speed is investigated. On the one hand, the proposed programming scheme, which combines channel hot-electron injection (CHEI) and hot-hole injection (HHI), delivers highly linear and symmetric potentiation and depression, which improves training speed and accuracy. On the other hand, the low-damage programming scheme and memory-window (MW) optimization effectively suppress cell degradation while improving computing accuracy. Even after 10⁹ cycles, the cell leakage current (I_off) remains below 10 pA, preserving the large-scale computing capability of the memory array. Read-disturb characterization further demonstrates the robust reliability of the scheme. On CIFAR-10 classification, ~90% accuracy is achieved after 10⁹ cycles with both ResNet50 and VGG16 networks. These results suggest that flash-based CIM has great potential to overcome the limitations of the traditional von Neumann architecture and to enable high-performance NN online training, paving the way for further development of artificial intelligence (AI) accelerators.
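The training gains claimed above hinge on the linearity and symmetry of the pulse-by-pulse conductance updates. As a rough illustration of why that matters (this is a minimal sketch, not the authors' code: the exponential-saturation update model, the pulse count, and all other parameters are assumptions), the Python snippet below trains a toy perceptron whose weights are differential pairs of normalized cell conductances and compares a linear, symmetric update (nu = 0) against increasingly nonlinear ones.

```python
# Illustrative sketch only: a common exponential-saturation model of
# pulse-based conductance updates, applied to a toy online-training task.
# All device parameters (window, pulse count, nu values) are hypothetical.
import numpy as np

G_MIN, G_MAX = 0.0, 1.0   # normalized conductance window (assumed)
N_PULSES = 64             # pulses to traverse the full window (assumed)

def potentiate(g, nu):
    """Apply one potentiation pulse; nu = 0 gives a constant (linear) step."""
    step = (G_MAX - G_MIN) / N_PULSES
    if nu == 0:
        return min(g + step, G_MAX)
    # Exponential saturation: the step shrinks as g approaches G_MAX.
    frac = (g - G_MIN) / (G_MAX - G_MIN)
    return min(g + step * nu * np.exp(-nu * frac) / (1 - np.exp(-nu)), G_MAX)

def depress(g, nu):
    """Apply one depression pulse, mirrored about the window."""
    step = (G_MAX - G_MIN) / N_PULSES
    if nu == 0:
        return max(g - step, G_MIN)
    frac = (G_MAX - g) / (G_MAX - G_MIN)
    return max(g - step * nu * np.exp(-nu * frac) / (1 - np.exp(-nu)), G_MIN)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable toy task

def train(nu, epochs=20):
    # Each weight is a differential pair: w_j = g_pos[j] - g_neg[j].
    g_pos = np.full(2, 0.5)
    g_neg = np.full(2, 0.5)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w = g_pos - g_neg
            pred = 1.0 / (1.0 + np.exp(-w @ xi))
            err = yi - pred                  # sign sets the update direction
            for j in range(2):
                if err * xi[j] > 0:          # gradient says: increase w_j
                    g_pos[j] = potentiate(g_pos[j], nu)
                    g_neg[j] = depress(g_neg[j], nu)
                else:                        # gradient says: decrease w_j
                    g_pos[j] = depress(g_pos[j], nu)
                    g_neg[j] = potentiate(g_neg[j], nu)
    w = g_pos - g_neg
    return np.mean(((X @ w) > 0) == (y > 0.5))

for nu in (0.0, 1.0, 5.0):
    print(f"update nonlinearity nu = {nu}: accuracy = {train(nu):.2f}")
```

With nu = 0 every pulse moves a weight by the same amount in either direction, so the device behaves like an ideal fixed-step SGD update; as nu grows, potentiation saturates near the top of the window and depression near the bottom, making the effective learning rate state-dependent and asymmetric. That is the kind of update distortion a highly linear, symmetric CHEI/HHI programming scheme is meant to avoid.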