Annealing-inspired training of an optical neural network with ternary weights

Anas Skalli, Mirko Goldmann, Nasibeh Haghighi, Stephan Reitzenstein, James A. Lott, Daniel Brunner
arXiv - CS - Emerging Technologies. Published 2024-09-02. DOI: arxiv-2409.01042 (https://doi.org/arxiv-2409.01042)
Citations: 0

Abstract

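Artificial neural networks (ANNs) represent a fundamentally connectionist and distributed approach to computing, and as such they differ from classical computers built on the von Neumann architecture. This has revived research interest in unconventional hardware that implements ANNs more efficiently than emulating them on traditional machines. To fully leverage the capabilities of this new generation of ANNs, optimization algorithms that take hardware limitations and imperfections into account are necessary. Photonics is a particularly promising platform, offering scalability, high speed, energy efficiency, and the capability for parallel information processing. Yet fully fledged implementations of autonomous optical neural networks (ONNs) with in-situ learning remain scarce. In this work, we propose a ternary-weight architecture for a high-dimensional semiconductor-laser-based ONN. We introduce a simple method for achieving ternary weights with Boolean hardware, significantly increasing the ONN's information processing capabilities. Furthermore, we design a novel in-situ optimization algorithm that is compatible with both Boolean and ternary weights, and provide a detailed hyperparameter study of this algorithm for two different tasks. Our algorithm yields benefits in terms of both convergence speed and performance. Finally, we experimentally characterize the long-term inference stability of our ONN and find it to be extremely stable, with a consistency above 99% over a period of more than 10 hours, addressing one of the main concerns in the field. Our work is of particular relevance for in-situ learning under restricted hardware resources, especially since minimizing the power consumption of auxiliary hardware is crucial to preserving the efficiency gains achieved by non-von Neumann ANN implementations.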
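The abstract does not spell out how ternary weights are obtained from Boolean hardware, nor the details of the in-situ optimizer. Purely as an illustration, the sketch below assumes that each ternary weight is the difference of two Boolean masks, and that training greedily accepts random bit flips whose number shrinks over time, a generic annealing-flavored search. The function names, the flip schedule, and the toy regression task are all hypothetical and are not taken from the paper.

```python
import numpy as np


def ternary_from_boolean(pos_mask, neg_mask):
    """Combine two Boolean masks into weights in {-1, 0, +1} (assumed construction)."""
    return pos_mask.astype(int) - neg_mask.astype(int)


def anneal_boolean_masks(x, y, n_epochs=300, flips_start=8, seed=0):
    """Greedy bit-flip search; the number of flips per step decays like a cooling schedule."""
    rng = np.random.default_rng(seed)
    n = x.shape[1]
    # First n bits form the "positive" mask, last n bits the "negative" mask.
    bits = rng.integers(0, 2, size=2 * n).astype(bool)

    def error(b):
        w = ternary_from_boolean(b[:n], b[n:])
        return float(np.mean((x @ w - y) ** 2))

    history = [error(bits)]
    for epoch in range(n_epochs):
        # Hypothetical schedule: linear decay from flips_start down to a single flip.
        n_flips = max(1, round(flips_start * (1 - epoch / n_epochs)))
        idx = rng.choice(2 * n, size=n_flips, replace=False)
        trial = bits.copy()
        trial[idx] = ~trial[idx]
        trial_error = error(trial)
        if trial_error <= history[-1]:
            # Keep the flipped configuration only if it does not increase the error.
            bits = trial
            history.append(trial_error)
        else:
            history.append(history[-1])
    return ternary_from_boolean(bits[:n], bits[n:]), history


# Toy usage: try to recover a random ternary readout from noiseless data.
data_rng = np.random.default_rng(1)
x = data_rng.normal(size=(200, 16))
w_true = data_rng.integers(-1, 2, size=16)
y = x @ w_true
w, history = anneal_boolean_masks(x, y)
print("initial error:", history[0], "final error:", history[-1])
```

Because a trial is kept only when it does not raise the error, the recorded error never increases. A real in-situ setup would measure the error on the optical hardware rather than compute it numerically.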