Resistive memory-based zero-shot liquid state machine for multimodal event data learning
Ning Lin, Shaocong Wang, Yi Li, Bo Wang, Shuhui Shi, Yangu He, Woyu Zhang, Yifei Yu, Yue Zhang, Xinyuan Zhang, Kwunhang Wong, Songqi Wang, Xiaoming Chen, Hao Jiang, Xumeng Zhang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Ming Liu
Nature Computational Science, published 2025-01-09. DOI: 10.1038/s43588-024-00751-z
Abstract
The human brain is a complex spiking neural network (SNN) that learns multimodal signals in a zero-shot manner by generalizing existing knowledge. Remarkably, it maintains minimal power consumption through event-based signal propagation. However, replicating the human brain in neuromorphic hardware presents both hardware and software challenges. Hardware limitations, such as the slowdown of Moore's law and the von Neumann bottleneck, hinder the efficiency of digital computers, while SNNs remain notoriously complex to train in software. To this end, we propose a hardware-software co-design on a 40 nm, 256 kB in-memory computing macro that physically integrates a fixed, random liquid state machine (LSM) SNN encoder with trainable artificial neural network projections. We showcase zero-shot learning of multimodal events on the N-MNIST and N-TIDIGITS datasets, including visual and audio data association, as well as neural and visual data alignment for brain-machine interfaces. Our co-design achieves classification accuracy comparable to fully optimized software models while reducing training costs 152.83-fold and 393.07-fold relative to state-of-the-art spiking recurrent neural network-based contrastive learning and prototypical networks, respectively, and improving energy efficiency 23.34-fold and 160-fold over cutting-edge digital hardware. These proof-of-principle prototypes demonstrate zero-shot multimodal event learning on emerging efficient and compact neuromorphic hardware.
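The sketch below illustrates the core idea described in the abstract: a fixed, random LSM reservoir encodes event-based spike trains, and only small linear projections are trained to align embeddings across modalities. This is a minimal, illustrative reconstruction, not the authors' implementation; the class and function names, layer sizes, leaky integrate-and-fire dynamics, and per-modality linear projections are all assumptions chosen for clarity.

```python
# Hypothetical sketch of a fixed, random liquid state machine (LSM) encoder
# with trainable linear projections for cross-modal alignment. Assumed
# details: LIF dynamics, reservoir size, sparsity, and projection shapes.
import numpy as np

rng = np.random.default_rng(0)

class LiquidStateMachine:
    """Fixed, random recurrent reservoir of leaky integrate-and-fire neurons."""

    def __init__(self, n_in, n_res=256, density=0.1, tau=0.9, threshold=1.0):
        self.tau = tau              # membrane leak factor per time step
        self.threshold = threshold  # spiking threshold
        # Random, untrained weights: these stay fixed, loosely mirroring
        # resistive-memory conductances that are never reprogrammed.
        self.w_in = rng.normal(0.0, 0.5, (n_res, n_in))
        mask = rng.random((n_res, n_res)) < density
        self.w_res = rng.normal(0.0, 0.2, (n_res, n_res)) * mask

    def encode(self, spikes):
        """spikes: (T, n_in) binary event stream -> (n_res,) rate embedding."""
        v = np.zeros(self.w_res.shape[0])  # membrane potentials
        s = np.zeros_like(v)               # spikes emitted at previous step
        rates = np.zeros_like(v)           # accumulated spike counts
        for x in spikes:
            v = self.tau * v + self.w_in @ x + self.w_res @ s
            s = (v >= self.threshold).astype(float)
            v = np.where(s > 0, 0.0, v)    # reset neurons that fired
            rates += s
        return rates / len(spikes)         # mean firing rate per neuron

def project(w, z):
    """Trainable linear projection to a unit-norm embedding."""
    e = w @ z
    return e / (np.linalg.norm(e) + 1e-8)

# Toy usage: compare a "visual" and an "audio" event stream via cosine
# similarity of their projected embeddings, the basis of zero-shot matching.
lsm_vis = LiquidStateMachine(n_in=64)
lsm_aud = LiquidStateMachine(n_in=32)
w_vis = rng.normal(0, 0.1, (16, 256))  # would be trained in practice
w_aud = rng.normal(0, 0.1, (16, 256))

vis_events = (rng.random((100, 64)) < 0.05).astype(float)
aud_events = (rng.random((100, 32)) < 0.05).astype(float)
sim = project(w_vis, lsm_vis.encode(vis_events)) @ project(w_aud, lsm_aud.encode(aud_events))
print(f"cross-modal cosine similarity: {sim:.3f}")
```

The design choice the sketch highlights is that the reservoir weights are never updated, so gradient-based training touches only the lightweight projections; this is, plausibly, where the reported orders-of-magnitude training-cost savings over end-to-end-trained spiking recurrent networks come from.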