Design and Analysis of a Nano-photonic Processing Unit for Low-Latency Recurrent Neural Network Applications

2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) Pub Date : 2022-12-01 DOI:10.1109/MCSoC57363.2022.00058

Eito Sato, Koji Inoue, Satoshi Kawakami

{"title":"Design and Analysis of a Nano-photonic Processing Unit for Low-Latency Recurrent Neural Network Applications","authors":"Eito Sato, Koji Inoue, Satoshi Kawakami","doi":"10.1109/MCSoC57363.2022.00058","DOIUrl":null,"url":null,"abstract":"Recurrent neural networks (RNNs) have achieved high performance in inference processing that handles time-series data. Among them, hardware acceleration for fast processing RNNs is helpful for tasks where real-time performance is es-sential, such as speech recognition and stock market prediction. The nano-photonic neural network accelerator is an approach that takes advantage of the high speed, high parallelism, and low power consumption of light to achieve high performance in neural network processing. However, existing methods are inefficient for RNNs due to significant overhead caused by the absence of recursive paths and the immaturity of the model to be designed. Therefore, architectural considerations that take advantage of RNN characteristics are essential for low latency. This paper proposes a fast and low-power processing unit for RNNs that introduces activation functions and recursion processing using optical devices. We clarified the impact of noise on the proposed circuit's calculation accuracy and inference accuracy. As a result, the calculation accuracy deteriorated significantly in proportion to the increase in the number of recursions, but the effect on inference accuracy was negligible. We also compared the performance of the proposed circuit to an all-electric design and a hybrid design that processes the vector-matrix product optically and the recursion electrically. As a result, the performance of the proposed circuit improves latency by 467x, reduces power consumption by 93.0% compared with the all-electrical design, improves latency by 7.3x, and reduces power consumption by 58.6% compared with the hybrid design.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MCSoC57363.2022.00058","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Recurrent neural networks (RNNs) have achieved high performance in inference processing that handles time-series data. Among them, hardware acceleration for fast processing RNNs is helpful for tasks where real-time performance is es-sential, such as speech recognition and stock market prediction. The nano-photonic neural network accelerator is an approach that takes advantage of the high speed, high parallelism, and low power consumption of light to achieve high performance in neural network processing. However, existing methods are inefficient for RNNs due to significant overhead caused by the absence of recursive paths and the immaturity of the model to be designed. Therefore, architectural considerations that take advantage of RNN characteristics are essential for low latency. This paper proposes a fast and low-power processing unit for RNNs that introduces activation functions and recursion processing using optical devices. We clarified the impact of noise on the proposed circuit's calculation accuracy and inference accuracy. As a result, the calculation accuracy deteriorated significantly in proportion to the increase in the number of recursions, but the effect on inference accuracy was negligible. We also compared the performance of the proposed circuit to an all-electric design and a hybrid design that processes the vector-matrix product optically and the recursion electrically. As a result, the performance of the proposed circuit improves latency by 467x, reduces power consumption by 93.0% compared with the all-electrical design, improves latency by 7.3x, and reduces power consumption by 58.6% compared with the hybrid design.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

用于低延迟递归神经网络的纳米光子处理单元的设计与分析

递归神经网络(RNNs)在处理时间序列数据的推理处理中取得了很高的性能。其中，快速处理rnn的硬件加速有助于实时性能要求很高的任务，如语音识别和股票市场预测。纳米光子神经网络加速器是一种利用光的高速、高并行性和低功耗来实现神经网络处理高性能的方法。然而，由于缺乏递归路径和待设计模型的不成熟，现有的方法对于rnn来说效率低下。因此，利用RNN特性的架构考虑对于低延迟至关重要。本文提出了一种快速、低功耗的rnn处理单元，该单元采用光学器件引入激活函数和递归处理。我们阐明了噪声对所提出电路的计算精度和推理精度的影响。结果，计算精度随递归次数的增加而显著下降，但对推理精度的影响可以忽略不计。我们还将所提出电路的性能与全电设计和混合设计进行了比较，混合设计以光学方式处理矢量矩阵乘积和电递归。结果表明，与全电设计相比，该电路的性能延迟提高了467x，功耗降低了93.0%，与混合设计相比，延迟提高了7.3x，功耗降低了58.6%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

自引率

0.00%

发文量