基于随机性的忆阻器交叉棒阵列中位置敏感的随机投影散列,用于稀疏自注意变压器

IF 5.3 2区 材料科学 Q2 MATERIALS SCIENCE, MULTIDISCIPLINARY Advanced Electronic Materials Pub Date : 2024-06-17 DOI:10.1002/aelm.202300850
Xinxin Wang, Ilia Valov, Huanglong Li
{"title":"基于随机性的忆阻器交叉棒阵列中位置敏感的随机投影散列,用于稀疏自注意变压器","authors":"Xinxin Wang,&nbsp;Ilia Valov,&nbsp;Huanglong Li","doi":"10.1002/aelm.202300850","DOIUrl":null,"url":null,"abstract":"<p>Self-attention mechanism is critically central to the state-of-the-art transformer models. Because the standard full self-attention has quadratic complexity with respect to the input's length L, resulting in prohibitively large memory for very long sequences, sparse self-attention enabled by random projection (RP)-based locality-sensitive hashing (LSH) has recently been proposed to reduce the complexity to O(L log L). However, in current digital computing hardware with a von Neumann architecture, RP, which is essentially a matrix multiplication operation, incurs unavoidable time and energy-consuming data shuttling between off-chip memory and processing units. In addition, it is known that digital computers simply cannot generate provably random numbers. With the emerging analog memristive technology, it is shown that it is feasible to harness the intrinsic device-to-device variability in the memristor crossbar array for implementing the RP matrix and perform RP-LSH computation in memory. On this basis, sequence prediction tasks are performed with a sparse self-attention-based Transformer in a hybrid software-hardware approach, achieving a testing accuracy over 70% with much less computational complexity. By further harnessing the cycle-to-cycle variability for multi-round hashing, 12% increase in the testing accuracy is demonstrated. This work extends the range of applications of memristor crossbar arrays to the state-of-the-art large language models (LLMs).</p>","PeriodicalId":110,"journal":{"name":"Advanced Electronic Materials","volume":"10 10","pages":""},"PeriodicalIF":5.3000,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aelm.202300850","citationCount":"0","resultStr":"{\"title\":\"Random Projection-Based Locality-Sensitive Hashing in a Memristor Crossbar Array with Stochasticity for Sparse Self-Attention-Based Transformer\",\"authors\":\"Xinxin Wang,&nbsp;Ilia Valov,&nbsp;Huanglong Li\",\"doi\":\"10.1002/aelm.202300850\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Self-attention mechanism is critically central to the state-of-the-art transformer models. Because the standard full self-attention has quadratic complexity with respect to the input's length L, resulting in prohibitively large memory for very long sequences, sparse self-attention enabled by random projection (RP)-based locality-sensitive hashing (LSH) has recently been proposed to reduce the complexity to O(L log L). However, in current digital computing hardware with a von Neumann architecture, RP, which is essentially a matrix multiplication operation, incurs unavoidable time and energy-consuming data shuttling between off-chip memory and processing units. In addition, it is known that digital computers simply cannot generate provably random numbers. With the emerging analog memristive technology, it is shown that it is feasible to harness the intrinsic device-to-device variability in the memristor crossbar array for implementing the RP matrix and perform RP-LSH computation in memory. On this basis, sequence prediction tasks are performed with a sparse self-attention-based Transformer in a hybrid software-hardware approach, achieving a testing accuracy over 70% with much less computational complexity. By further harnessing the cycle-to-cycle variability for multi-round hashing, 12% increase in the testing accuracy is demonstrated. This work extends the range of applications of memristor crossbar arrays to the state-of-the-art large language models (LLMs).</p>\",\"PeriodicalId\":110,\"journal\":{\"name\":\"Advanced Electronic Materials\",\"volume\":\"10 10\",\"pages\":\"\"},\"PeriodicalIF\":5.3000,\"publicationDate\":\"2024-06-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aelm.202300850\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Advanced Electronic Materials\",\"FirstCategoryId\":\"88\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/aelm.202300850\",\"RegionNum\":2,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATERIALS SCIENCE, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advanced Electronic Materials","FirstCategoryId":"88","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/aelm.202300850","RegionNum":2,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

摘要

自注意机制是最先进变压器模型的关键核心。由于标准的完全自注意具有相对于输入长度 L 的二次复杂性,导致超长序列的内存过大,因此最近提出了基于随机投影(RP)的位置敏感散列(LSH)的稀疏自注意,以将复杂性降低到 O(L log L)。然而,在目前采用冯-诺依曼架构的数字计算硬件中,RP 本质上是一种矩阵乘法运算,在片外内存和处理单元之间进行数据穿梭需要耗费大量时间和精力,这是不可避免的。此外,众所周知,数字计算机根本无法生成可证明的随机数。新兴的模拟忆阻器技术表明,利用忆阻器交叉棒阵列中器件与器件之间的内在可变性来实现 RP 矩阵并在存储器中执行 RP-LSH 计算是可行的。在此基础上,通过软硬件混合方法,利用基于稀疏自注意的Transformer执行序列预测任务,以更低的计算复杂度实现了超过70%的测试精度。通过进一步利用多轮散列的周期间可变性,测试精度提高了 12%。这项工作扩展了忆阻器横杆阵列在最先进的大型语言模型(LLM)中的应用范围。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Random Projection-Based Locality-Sensitive Hashing in a Memristor Crossbar Array with Stochasticity for Sparse Self-Attention-Based Transformer

Self-attention mechanism is critically central to the state-of-the-art transformer models. Because the standard full self-attention has quadratic complexity with respect to the input's length L, resulting in prohibitively large memory for very long sequences, sparse self-attention enabled by random projection (RP)-based locality-sensitive hashing (LSH) has recently been proposed to reduce the complexity to O(L log L). However, in current digital computing hardware with a von Neumann architecture, RP, which is essentially a matrix multiplication operation, incurs unavoidable time and energy-consuming data shuttling between off-chip memory and processing units. In addition, it is known that digital computers simply cannot generate provably random numbers. With the emerging analog memristive technology, it is shown that it is feasible to harness the intrinsic device-to-device variability in the memristor crossbar array for implementing the RP matrix and perform RP-LSH computation in memory. On this basis, sequence prediction tasks are performed with a sparse self-attention-based Transformer in a hybrid software-hardware approach, achieving a testing accuracy over 70% with much less computational complexity. By further harnessing the cycle-to-cycle variability for multi-round hashing, 12% increase in the testing accuracy is demonstrated. This work extends the range of applications of memristor crossbar arrays to the state-of-the-art large language models (LLMs).

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Advanced Electronic Materials
Advanced Electronic Materials NANOSCIENCE & NANOTECHNOLOGYMATERIALS SCIE-MATERIALS SCIENCE, MULTIDISCIPLINARY
CiteScore
11.00
自引率
3.20%
发文量
433
期刊介绍: Advanced Electronic Materials is an interdisciplinary forum for peer-reviewed, high-quality, high-impact research in the fields of materials science, physics, and engineering of electronic and magnetic materials. It includes research on physics and physical properties of electronic and magnetic materials, spintronics, electronics, device physics and engineering, micro- and nano-electromechanical systems, and organic electronics, in addition to fundamental research.
期刊最新文献
Physical Reservoir Computing Utilizing Ion-Gating Transistors Operating in Electric Double Layer and Redox Mechanisms Single-Cell Membrane Potential Stimulation and Recording by an Electrolyte-Gated Organic Field-Effect Transistor 2D α-In2Se3 Flakes for High Frequency Tunable and Switchable Film Bulk Acoustic Wave Resonators Aqueous Ammonia Sensor with Neuromorphic Detection 3D Nano Hafnium-Based Ferroelectric Memory Vertical Array for High-Density and High-Reliability Logic-In-Memory Application
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1