了解 FPGA 中超频收缩乘积阵列的时序误差特性

IF 1.6 Q3 ENGINEERING, ELECTRICAL & ELECTRONIC Journal of Low Power Electronics and Applications Pub Date : 2024-01-09 DOI:10.3390/jlpea14010004
Andrew Chamberlin, Andrew Gerber, Mason Palmer, Tim Goodale, N. D. Gundi, Koushik Chakraborty, Sanghamitra Roy
{"title":"了解 FPGA 中超频收缩乘积阵列的时序误差特性","authors":"Andrew Chamberlin, Andrew Gerber, Mason Palmer, Tim Goodale, N. D. Gundi, Koushik Chakraborty, Sanghamitra Roy","doi":"10.3390/jlpea14010004","DOIUrl":null,"url":null,"abstract":"Artificial Intelligence (AI) hardware accelerators have seen tremendous developments in recent years due to the rapid growth of AI in multiple fields. Many such accelerators comprise a Systolic Multiply–Accumulate Array (SMA) as its computational brain. In this paper, we investigate the faulty output characterization of an SMA in a real silicon FPGA board. Experiments were run on a single Zybo Z7-20 board to control for process variation at nominal voltage and in small batches to control for temperature. The FPGA is rated up to 800 MHz in the data sheet due to the max frequency of the PLL, but the design is written using Verilog for the FPGA and C++ for the processor and synthesized with a chosen constraint of a 125 MHz clock. We then operate the system at a frequency range of 125 MHz to 450 MHz for the FPGA and the nominal 667 MHz for the processor core to produce timing errors in the FPGA without affecting the processor. Our extensive experimental platform with a hardware–software ecosystem provides a methodological pathway that reveals fascinating characteristics of SMA behavior under an overclocked environment. While one may intuitively expect that timing errors resulting from overclocked hardware may produce a wide variation in output values, our post-silicon evaluation reveals a lack of variation in erroneous output values. We found an intriguing pattern where error output values are stable for a given input across a range of operating frequencies far exceeding the rated frequency of the FPGA.","PeriodicalId":38100,"journal":{"name":"Journal of Low Power Electronics and Applications","volume":"55 18","pages":""},"PeriodicalIF":1.6000,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Understanding Timing Error Characteristics from Overclocked Systolic Multiply–Accumulate Arrays in FPGAs\",\"authors\":\"Andrew Chamberlin, Andrew Gerber, Mason Palmer, Tim Goodale, N. D. Gundi, Koushik Chakraborty, Sanghamitra Roy\",\"doi\":\"10.3390/jlpea14010004\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Artificial Intelligence (AI) hardware accelerators have seen tremendous developments in recent years due to the rapid growth of AI in multiple fields. Many such accelerators comprise a Systolic Multiply–Accumulate Array (SMA) as its computational brain. In this paper, we investigate the faulty output characterization of an SMA in a real silicon FPGA board. Experiments were run on a single Zybo Z7-20 board to control for process variation at nominal voltage and in small batches to control for temperature. The FPGA is rated up to 800 MHz in the data sheet due to the max frequency of the PLL, but the design is written using Verilog for the FPGA and C++ for the processor and synthesized with a chosen constraint of a 125 MHz clock. We then operate the system at a frequency range of 125 MHz to 450 MHz for the FPGA and the nominal 667 MHz for the processor core to produce timing errors in the FPGA without affecting the processor. Our extensive experimental platform with a hardware–software ecosystem provides a methodological pathway that reveals fascinating characteristics of SMA behavior under an overclocked environment. While one may intuitively expect that timing errors resulting from overclocked hardware may produce a wide variation in output values, our post-silicon evaluation reveals a lack of variation in erroneous output values. We found an intriguing pattern where error output values are stable for a given input across a range of operating frequencies far exceeding the rated frequency of the FPGA.\",\"PeriodicalId\":38100,\"journal\":{\"name\":\"Journal of Low Power Electronics and Applications\",\"volume\":\"55 18\",\"pages\":\"\"},\"PeriodicalIF\":1.6000,\"publicationDate\":\"2024-01-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Low Power Electronics and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/jlpea14010004\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Low Power Electronics and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/jlpea14010004","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

近年来,随着人工智能在多个领域的快速发展,人工智能(AI)硬件加速器也取得了巨大的发展。许多此类加速器都包含一个作为其计算大脑的收缩乘积阵列(SMA)。本文研究了实际硅 FPGA 板中 SMA 的故障输出特性。实验在一块 Zybo Z7-20 电路板上进行,以控制额定电压下的工艺变化,并小批量地控制温度。由于 PLL 的最大频率,数据表中 FPGA 的额定频率高达 800 MHz,但 FPGA 的设计是使用 Verilog 编写的,处理器的设计是使用 C++ 编写的,合成时选择了 125 MHz 的时钟限制。然后,我们在 FPGA 的 125 MHz 至 450 MHz 频率范围和处理器内核的标称 667 MHz 频率范围内运行系统,以便在不影响处理器的情况下在 FPGA 中产生时序误差。我们具有硬件-软件生态系统的广泛实验平台提供了一种方法论途径,揭示了超频环境下 SMA 行为的迷人特征。人们可能直觉地认为,超频硬件导致的时序错误可能会使输出值变化很大,但我们的硅后评估却发现错误输出值缺乏变化。我们发现了一种有趣的模式,即在远超 FPGA 额定频率的工作频率范围内,给定输入的错误输出值保持稳定。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Understanding Timing Error Characteristics from Overclocked Systolic Multiply–Accumulate Arrays in FPGAs
Artificial Intelligence (AI) hardware accelerators have seen tremendous developments in recent years due to the rapid growth of AI in multiple fields. Many such accelerators comprise a Systolic Multiply–Accumulate Array (SMA) as its computational brain. In this paper, we investigate the faulty output characterization of an SMA in a real silicon FPGA board. Experiments were run on a single Zybo Z7-20 board to control for process variation at nominal voltage and in small batches to control for temperature. The FPGA is rated up to 800 MHz in the data sheet due to the max frequency of the PLL, but the design is written using Verilog for the FPGA and C++ for the processor and synthesized with a chosen constraint of a 125 MHz clock. We then operate the system at a frequency range of 125 MHz to 450 MHz for the FPGA and the nominal 667 MHz for the processor core to produce timing errors in the FPGA without affecting the processor. Our extensive experimental platform with a hardware–software ecosystem provides a methodological pathway that reveals fascinating characteristics of SMA behavior under an overclocked environment. While one may intuitively expect that timing errors resulting from overclocked hardware may produce a wide variation in output values, our post-silicon evaluation reveals a lack of variation in erroneous output values. We found an intriguing pattern where error output values are stable for a given input across a range of operating frequencies far exceeding the rated frequency of the FPGA.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Low Power Electronics and Applications
Journal of Low Power Electronics and Applications Engineering-Electrical and Electronic Engineering
CiteScore
3.60
自引率
14.30%
发文量
57
审稿时长
11 weeks
期刊最新文献
Understanding Timing Error Characteristics from Overclocked Systolic Multiply–Accumulate Arrays in FPGAs Design and Assessment of Hybrid MTJ/CMOS Circuits for In-Memory-Computation Speed, Power and Area Optimized Monotonic Asynchronous Array Multipliers An Ultra Low Power Integer-N PLL with a High-Gain Sampling Phase Detector for IOT Applications in 65 nm CMOS Design of a Low-Power Delay-Locked Loop-Based 8× Frequency Multiplier in 22 nm FDSOI
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1