A 22-nm All-Digital Time-Domain Neural Network Accelerator for Precision In-Sensor Processing

IF 2.8 | CAS Quartile 2 (Engineering & Technology) | JCR Q2 (Computer Science, Hardware & Architecture) | IEEE Transactions on Very Large Scale Integration (VLSI) Systems | Pub Date: 2024-11-19 | DOI: 10.1109/TVLSI.2024.3496090
Ahmed M. Mohey;Jelin Leslin;Gaurav Singh;Marko Kosunen;Jussi Ryynänen;Martin Andraud
{"title":"用于精密传感器处理的22纳米全数字时域神经网络加速器","authors":"Ahmed M. Mohey;Jelin Leslin;Gaurav Singh;Marko Kosunen;Jussi Ryynänen;Martin Andraud","doi":"10.1109/TVLSI.2024.3496090","DOIUrl":null,"url":null,"abstract":"Deep neural network (DNN) accelerators are increasingly integrated into sensing applications, such as wearables and sensor networks, to provide advanced in-sensor processing capabilities. Given wearables’ strict size and power requirements, minimizing the area and energy consumption of DNN accelerators is a critical concern. In that regard, computing DNN models in the time domain is a promising architecture, taking advantage of both technology scaling friendliness and efficiency. Yet, time-domain accelerators are typically not fully digital, limiting the full benefits of time-domain computation. In this work, we propose an all-digital time-domain accelerator with a small size and low energy consumption to target precision in-sensor processing like human activity recognition (HAR). The proposed accelerator features a simple and efficient architecture without dependencies on analog nonidealities such as leakage and charge errors. An eight-neuron layer (core computation layer) is implemented in 22-nm FD-SOI technology. The layer occupies \n<inline-formula> <tex-math>$70 \\times \\,70\\,\\mu $ </tex-math></inline-formula>\nm while supporting multibit inputs (8-bit) and weights (8-bit) with signed accumulation up to 18 bits. The power dissipation of the computation layer is 576\n<inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>\nW at 0.72-V supply and 500-MHz clock frequency achieving an average area efficiency of 24.74 GOPS/mm2 (up to 544.22 GOPS/mm2), an average energy efficiency of 0.21 TOPS/W (up to 4.63 TOPS/W), and a normalized energy efficiency of 13.46 1b-TOPS/W (up to 296.30 1b-TOPS/W).","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 12","pages":"2220-2231"},"PeriodicalIF":2.8000,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A 22-nm All-Digital Time-Domain Neural Network Accelerator for Precision In-Sensor Processing\",\"authors\":\"Ahmed M. Mohey;Jelin Leslin;Gaurav Singh;Marko Kosunen;Jussi Ryynänen;Martin Andraud\",\"doi\":\"10.1109/TVLSI.2024.3496090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural network (DNN) accelerators are increasingly integrated into sensing applications, such as wearables and sensor networks, to provide advanced in-sensor processing capabilities. Given wearables’ strict size and power requirements, minimizing the area and energy consumption of DNN accelerators is a critical concern. In that regard, computing DNN models in the time domain is a promising architecture, taking advantage of both technology scaling friendliness and efficiency. Yet, time-domain accelerators are typically not fully digital, limiting the full benefits of time-domain computation. In this work, we propose an all-digital time-domain accelerator with a small size and low energy consumption to target precision in-sensor processing like human activity recognition (HAR). The proposed accelerator features a simple and efficient architecture without dependencies on analog nonidealities such as leakage and charge errors. An eight-neuron layer (core computation layer) is implemented in 22-nm FD-SOI technology. 
The layer occupies \\n<inline-formula> <tex-math>$70 \\\\times \\\\,70\\\\,\\\\mu $ </tex-math></inline-formula>\\nm while supporting multibit inputs (8-bit) and weights (8-bit) with signed accumulation up to 18 bits. The power dissipation of the computation layer is 576\\n<inline-formula> <tex-math>$\\\\mu $ </tex-math></inline-formula>\\nW at 0.72-V supply and 500-MHz clock frequency achieving an average area efficiency of 24.74 GOPS/mm2 (up to 544.22 GOPS/mm2), an average energy efficiency of 0.21 TOPS/W (up to 4.63 TOPS/W), and a normalized energy efficiency of 13.46 1b-TOPS/W (up to 296.30 1b-TOPS/W).\",\"PeriodicalId\":13425,\"journal\":{\"name\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"volume\":\"32 12\",\"pages\":\"2220-2231\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10758340/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10758340/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Deep neural network (DNN) accelerators are increasingly integrated into sensing applications, such as wearables and sensor networks, to provide advanced in-sensor processing capabilities. Given wearables' strict size and power requirements, minimizing the area and energy consumption of DNN accelerators is a critical concern. In that regard, computing DNN models in the time domain is a promising architecture, taking advantage of both technology scaling friendliness and efficiency. Yet, time-domain accelerators are typically not fully digital, limiting the full benefits of time-domain computation. In this work, we propose an all-digital time-domain accelerator with a small size and low energy consumption to target precision in-sensor processing like human activity recognition (HAR). The proposed accelerator features a simple and efficient architecture without dependencies on analog nonidealities such as leakage and charge errors. An eight-neuron layer (core computation layer) is implemented in 22-nm FD-SOI technology. The layer occupies 70 × 70 µm while supporting multibit inputs (8-bit) and weights (8-bit) with signed accumulation up to 18 bits. The power dissipation of the computation layer is 576 µW at a 0.72-V supply and 500-MHz clock frequency, achieving an average area efficiency of 24.74 GOPS/mm² (up to 544.22 GOPS/mm²), an average energy efficiency of 0.21 TOPS/W (up to 4.63 TOPS/W), and a normalized energy efficiency of 13.46 1b-TOPS/W (up to 296.30 1b-TOPS/W).
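For context, the headline figures are mutually consistent: the reported 70 × 70 µm layer area (0.0049 mm²), the 576 µW power, and the common convention of normalizing energy efficiency by the 8-bit input and 8-bit weight precisions tie the area-efficiency, energy-efficiency, and 1b-TOPS/W numbers together. The short Python sketch below is not from the paper; it is a back-of-the-envelope cross-check under those assumptions, with small mismatches attributable to rounding.

```python
# Back-of-the-envelope cross-check of the reported metrics (not from the paper).
# Assumptions: area = 70 um x 70 um, power = 576 uW, and 1b-TOPS/W obtained by
# scaling TOPS/W by input bits x weight bits (8 x 8).

area_mm2 = (70e-3) * (70e-3)          # 70 um x 70 um = 0.0049 mm^2
power_w = 576e-6                      # 576 uW at 0.72 V, 500 MHz

# Throughput implied by the reported average and peak energy efficiencies
avg_gops = 0.21e12 * power_w / 1e9    # ~0.121 GOPS
peak_gops = 4.63e12 * power_w / 1e9   # ~2.667 GOPS

# Area efficiency implied by those throughputs
print(f"avg  area efficiency ~ {avg_gops / area_mm2:.2f} GOPS/mm^2 (reported 24.74)")
print(f"peak area efficiency ~ {peak_gops / area_mm2:.2f} GOPS/mm^2 (reported 544.22)")

# Normalized (1-bit-equivalent) energy efficiency: scale by input bits x weight bits
bits = 8 * 8
print(f"avg  normalized efficiency ~ {0.21 * bits:.2f} 1b-TOPS/W (reported 13.46)")
print(f"peak normalized efficiency ~ {4.63 * bits:.2f} 1b-TOPS/W (reported 296.30)")
```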
Source journal
CiteScore: 6.40
Self-citation rate: 7.10%
Articles published: 187
Review time: 3.6 months
Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems-integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and system-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.