Design and Analysis of a Nano-photonic Processing Unit for Low-Latency Recurrent Neural Network Applications

Eito Sato, Koji Inoue, Satoshi Kawakami
Published in: 2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)
Publication date: 2022-12-01
DOI: 10.1109/MCSoC57363.2022.00058 (https://doi.org/10.1109/MCSoC57363.2022.00058)

Abstract

Recurrent neural networks (RNNs) achieve high performance in inference on time-series data. Hardware acceleration of RNNs is particularly useful for tasks where real-time performance is essential, such as speech recognition and stock market prediction. Nano-photonic neural network accelerators exploit the high speed, high parallelism, and low power consumption of light to achieve high performance in neural network processing. However, existing methods are inefficient for RNNs because the absence of recursive paths and the immaturity of the model to be designed cause significant overhead. Architectural considerations that exploit RNN characteristics are therefore essential for low latency. This paper proposes a fast, low-power processing unit for RNNs that implements activation functions and recursion processing with optical devices. We evaluate the impact of noise on the proposed circuit's calculation accuracy and inference accuracy. Calculation accuracy deteriorated significantly in proportion to the number of recursions, but the effect on inference accuracy was negligible. We also compare the performance of the proposed circuit with an all-electrical design and with a hybrid design that processes the vector-matrix product optically and the recursion electrically. Compared with the all-electrical design, the proposed circuit improves latency by 467x and reduces power consumption by 93.0%; compared with the hybrid design, it improves latency by 7.3x and reduces power consumption by 58.6%.
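The noise analysis described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's circuit model. It assumes a plain Elman-style cell, h' = tanh(W_h h + W_x x), and stands in for analog noise with additive Gaussian noise on the pre-activation. The function names, the hidden size of 8, the random weights, and the noise level are all hypothetical choices made for illustration. It runs a noisy and a noise-free copy of the same recurrence side by side and records the RMS difference between their hidden states after each recursion.

```python
import math
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_step(w_h, w_x, h, x, sigma, rng):
    """One Elman-style step: h' = tanh(W_h h + W_x x + noise).

    sigma is the standard deviation of additive Gaussian noise on the
    pre-activation (an assumed stand-in for analog noise, not the
    paper's noise model). sigma == 0 gives the noise-free step.
    """
    pre = [a + b for a, b in zip(matvec(w_h, h), matvec(w_x, x))]
    if sigma > 0.0:
        pre = [p + rng.gauss(0.0, sigma) for p in pre]
    return [math.tanh(p) for p in pre]


def hidden_state_error(steps, sigma, size=8, seed=0):
    """Return the per-step RMS error between a noisy and a clean run."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(size)
    # Hypothetical random weights and inputs, fixed by the seed.
    w_h = [[rng.gauss(0.0, scale) for _ in range(size)] for _ in range(size)]
    w_x = [[rng.gauss(0.0, scale) for _ in range(size)] for _ in range(size)]
    inputs = [[rng.uniform(-1.0, 1.0) for _ in range(size)]
              for _ in range(steps)]

    noise_rng = random.Random(seed + 1)
    h_clean = [0.0] * size
    h_noisy = [0.0] * size
    errors = []
    for x in inputs:
        h_clean = rnn_step(w_h, w_x, h_clean, x, 0.0, noise_rng)
        h_noisy = rnn_step(w_h, w_x, h_noisy, x, sigma, noise_rng)
        diff = [a - b for a, b in zip(h_clean, h_noisy)]
        errors.append(math.sqrt(sum(d * d for d in diff) / size))
    return errors


if __name__ == "__main__":
    for step, err in enumerate(hidden_state_error(steps=20, sigma=0.05), 1):
        print(f"recursion {step:2d}: RMS hidden-state error = {err:.4f}")
```

Running the script prints the hidden-state deviation after each recursion, which is one way to observe how per-step noise propagates through the feedback path. Whether a given deviation changes the final prediction depends on the task and the readout layer, which this sketch does not model.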