Mingdai Yang, Qiuwen Lou, R. Rajaei, M. Jokar, Junyi Qiu, Yuming Liu, Aditi Udupa, F. Chong, J. Dallesasse, Milton Feng, L. Goddard, X. S. Hu, Yanjing Li
{"title":"A Hybrid Optical-Electrical Analog Deep Learning Accelerator Using Incoherent Optical Signals","authors":"Mingdai Yang, Qiuwen Lou, R. Rajaei, M. Jokar, Junyi Qiu, Yuming Liu, Aditi Udupa, F. Chong, J. Dallesasse, Milton Feng, L. Goddard, X. S. Hu, Yanjing Li","doi":"10.1145/3584183","DOIUrl":null,"url":null,"abstract":"Optical deep learning (DL) accelerators have attracted significant interests due to their latency and power advantages. In this article, we focus on incoherent optical designs. A significant challenge is that there is no known solution to perform single-wavelength accumulation (a key operation required for DL workloads) using incoherent optical signals efficiently. Therefore, we devise a hybrid approach, where accumulation is done in the electrical domain, and multiplication is performed in the optical domain. The key technology enabler of our design is the transistor laser, which performs electrical-to-optical and optical-to-electrical conversions efficiently. Through detailed design and evaluation of our design, along with a comprehensive benchmarking study against state-of-the-art RRAM-based designs, we derive the following key results: (1) For a four-layer multilayer perceptron network, our design achieves 115× and 17.11× improvements in latency and energy, respectively, compared to the RRAM-based design. We can take full advantage of the speed and energy benefits of the optical technology because the inference task can be entirely mapped onto our design. (2) For a complex workload (Resnet50), weight reprogramming is needed, and intermediate results need to be stored/re-fetched to/from memories. In this case, for the same area, our design still outperforms the RRAM-based design by 15.92× in inference latency, and 8.99× in energy.","PeriodicalId":50924,"journal":{"name":"ACM Journal on Emerging Technologies in Computing Systems","volume":" ","pages":"1 - 24"},"PeriodicalIF":2.1000,"publicationDate":"2023-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Journal on Emerging Technologies in Computing Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3584183","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
Optical deep learning (DL) accelerators have attracted significant interests due to their latency and power advantages. In this article, we focus on incoherent optical designs. A significant challenge is that there is no known solution to perform single-wavelength accumulation (a key operation required for DL workloads) using incoherent optical signals efficiently. Therefore, we devise a hybrid approach, where accumulation is done in the electrical domain, and multiplication is performed in the optical domain. The key technology enabler of our design is the transistor laser, which performs electrical-to-optical and optical-to-electrical conversions efficiently. Through detailed design and evaluation of our design, along with a comprehensive benchmarking study against state-of-the-art RRAM-based designs, we derive the following key results: (1) For a four-layer multilayer perceptron network, our design achieves 115× and 17.11× improvements in latency and energy, respectively, compared to the RRAM-based design. We can take full advantage of the speed and energy benefits of the optical technology because the inference task can be entirely mapped onto our design. (2) For a complex workload (Resnet50), weight reprogramming is needed, and intermediate results need to be stored/re-fetched to/from memories. In this case, for the same area, our design still outperforms the RRAM-based design by 15.92× in inference latency, and 8.99× in energy.
期刊介绍:
The Journal of Emerging Technologies in Computing Systems invites submissions of original technical papers describing research and development in emerging technologies in computing systems. Major economic and technical challenges are expected to impede the continued scaling of semiconductor devices. This has resulted in the search for alternate mechanical, biological/biochemical, nanoscale electronic, asynchronous and quantum computing and sensor technologies. As the underlying nanotechnologies continue to evolve in the labs of chemists, physicists, and biologists, it has become imperative for computer scientists and engineers to translate the potential of the basic building blocks (analogous to the transistor) emerging from these labs into information systems. Their design will face multiple challenges ranging from the inherent (un)reliability due to the self-assembly nature of the fabrication processes for nanotechnologies, from the complexity due to the sheer volume of nanodevices that will have to be integrated for complex functionality, and from the need to integrate these new nanotechnologies with silicon devices in the same system.
The journal provides comprehensive coverage of innovative work in the specification, design analysis, simulation, verification, testing, and evaluation of computing systems constructed out of emerging technologies and advanced semiconductors