Optoelectronic nonlinear Softmax operator based on diffractive neural networks
Ziyu Zhan, Hao Wang, Qiang Liu, Xing Fu
Optics Express, vol. 32, no. 15, pp. 26458-26469 (published 2024-07-15)
DOI: 10.1364/OE.527843 (https://doi.org/10.1364/OE.527843)
Citation count: 0
Abstract
Softmax, a pervasive nonlinear operation, plays a pivotal role in numerous statistical and deep learning (DL) models such as ChatGPT. Computing it is expensive, especially for at-scale models. Several software and hardware speed-up strategies have been proposed, but they still suffer from low efficiency and poor scalability. Here we propose a photonic-computing solution comprising massive numbers of programmable neurons that is capable of executing this operation in an accurate, computation-efficient, robust and scalable manner. Experimental results show that our diffraction-based computing system exhibits salient generalization ability in diverse artificial and real-world tasks (mean square error < 10⁻⁵). We further analyze its performance under several realistic limiting factors. Such a flexible system not only contributes to optimizing the mechanism of the Softmax operation but may also inspire the manufacture of a plug-and-play module for general optoelectronic accelerators.
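For readers unfamiliar with the operation and the error metric quoted in the abstract, the following is a minimal software sketch of Softmax and of the mean square error between an approximate output and a reference. It is not the authors' optical method. The function names, the input vector and the perturbation values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable Softmax: exp(x_i - max) / sum_j exp(x_j - max)."""
    m = max(xs)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_square_error(predicted, reference):
    """Average of squared element-wise differences."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)


# Reference Softmax of an illustrative input vector.
reference = softmax([1.0, 2.0, 3.0])

# Hypothetical "measured" output: the reference plus small fixed deviations,
# standing in for whatever an approximate hardware implementation returns.
measured = [r + d for r, d in zip(reference, [1e-3, -2e-3, 1e-3])]

error = mean_square_error(measured, reference)
print(reference, error)
```

With deviations of this size the mean square error comes out at about 2 × 10⁻⁶, which sits below the 10⁻⁵ level mentioned in the abstract. The numbers here are invented and say nothing about the reported experiments.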
About the journal:
Optics Express is the all-electronic, open access journal for optics providing rapid publication for peer-reviewed articles that emphasize scientific and technology innovations in all aspects of optics and photonics.