Efficient O-type mapping and routing of large-scale neural networks to torus-based ONoCs

IF 4 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Hardware & Architecture) · Journal of Optical Communications and Networking · Pub Date: 2024-08-26 · DOI: 10.1364/JOCN.525666
Qiuyan Yao;Daqing Meng;Hui Yang;Nan Feng;Jie Zhang
Journal of Optical Communications and Networking, vol. 16, no. 9, pp. 918-928. Journal Article. Source: https://ieeexplore.ieee.org/document/10646889/

Abstract

The rapid development of artificial intelligence has accelerated the arrival of the era of large models. Artificial-neural-network-based large models typically have millions to billions of parameters, and their training and inference processes place strict requirements on hardware, especially at the chip level, in terms of interconnection bandwidth, processing speed, and latency. The optical network-on-chip (ONoC) is a new interconnection technology that connects IP cores through a network of optical waveguides. Owing to its advantages of low loss, high throughput, and low delay, this communication mode has gradually become a key technology for improving the efficiency of large models. At present, the ONoC has been used to reduce the interconnection complexity of neural network accelerators, where neural network models are reshaped so that they can be mapped onto the processing elements of the ONoC and communicate at high speed on chip. In this paper, we first propose a torus-based O-type mapping strategy to realize efficient mapping of neuron groups to the chip. Additionally, a low-congestion arbitrator based on array congestion information is designed, and a multi-path low-congestion routing algorithm named TMLA is then presented to alleviate array congestion and disperse the routing pressure of each path. Results demonstrate that the proposed mapping and routing scheme can reduce the average network delay without additional loss when the injection rate is relatively large, which provides a valuable reference for research on neural network acceleration.
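
To make the idea of congestion-aware routing on a torus more concrete, the following is a minimal generic sketch in Python. It is not the paper's O-type mapping or the TMLA algorithm, whose details the abstract does not give. It only assumes a 2D torus of `rows` x `cols` nodes with wraparound links and a hypothetical per-node `congestion` table, and at each hop it picks the least congested neighbour among those that lie on a shortest path. All function and parameter names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (row, column) coordinate on the torus


def ring_steps(src: int, dst: int, size: int) -> List[int]:
    """Return the step(s) (+1 or -1) that follow a shortest way around a ring."""
    if src == dst:
        return []
    forward = (dst - src) % size   # hops when moving in the +1 direction
    backward = size - forward      # hops when moving in the -1 direction
    if forward < backward:
        return [1]
    if backward < forward:
        return [-1]
    return [1, -1]                 # both directions are equally short


def torus_distance(a: Node, b: Node, rows: int, cols: int) -> int:
    """Minimal hop count between two nodes, using wraparound links."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


def next_hop(cur: Node, dst: Node, rows: int, cols: int,
             congestion: Dict[Node, int]) -> Node:
    """Pick the least congested neighbour that still lies on a shortest path."""
    candidates: List[Node] = []
    for step in ring_steps(cur[0], dst[0], rows):
        candidates.append(((cur[0] + step) % rows, cur[1]))
    for step in ring_steps(cur[1], dst[1], cols):
        candidates.append((cur[0], (cur[1] + step) % cols))
    if not candidates:
        return cur  # already at the destination
    # Hypothetical congestion metric: nodes missing from the table count as 0.
    return min(candidates, key=lambda n: congestion.get(n, 0))


def route(src: Node, dst: Node, rows: int, cols: int,
          congestion: Dict[Node, int]) -> List[Node]:
    """Build a minimal path hop by hop, steering around congested nodes."""
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst, rows, cols, congestion))
    return path
```

For example, on a 4 x 4 torus `route((0, 0), (0, 3), 4, 4, {})` uses the wraparound link and takes a single hop, and marking node `(1, 0)` as congested makes a route from `(0, 0)` to `(1, 1)` go through `(0, 1)` instead. Because every candidate reduces the remaining distance by one hop, the path stays minimal while the load is spread across the available shortest paths.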
Source journal
CiteScore: 9.40
Self-citation rate: 16.00%
Articles published: 104
Review time: 4 months
Journal scope: The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.