
IEEE Embedded Systems Letters: Latest Articles

Using Static Analysis for Enhancing HLS Security
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-11-03; DOI: 10.1109/LES.2023.3329417
Luca Collini;Joey Ah-Kiow;Christian Pilato;Ramesh Karri;Benjamin Tan
Due to the increasing complexity of modern integrated circuits, high-level synthesis (HLS) is becoming a key technology in hardware design. HLS uses optimizations to assist during design space exploration. However, some of them can introduce security weaknesses. We propose an approach that leverages static analysis to identify a class of weaknesses in HLS-generated code. We show that some of these weaknesses can be corrected through the automatic generation of HLS directives. We evaluate our approach by comparing the static analysis results with formal verification. Our results show that the static approach has the same accuracy as formal methods while being $3\times$ to $200\times$ faster.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 166-169
Citations: 0
Experimental Investigation of Side-Channel Attacks on Neuromorphic Spiking Neural Networks
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-27; DOI: 10.1109/LES.2023.3328223
Bhanprakash Goswami;Tamoghno Das;Manan Suri
This study investigates the reliability of commonly utilized digital spiking neurons and the potential side-channel vulnerabilities in neuromorphic systems that employ them. Through our experiments, we have successfully decoded the parametric information of Izhikevich and leaky integrate-and-fire (LIF) neuron-based spiking neural networks (SNNs) using differential power analysis. Furthermore, we have demonstrated the practical application of extracted information from the 92% accurate pretrained standard spiking convolution neural network classifier on the FashionMNIST dataset. These findings highlight the potential dangers of utilizing internal information for side-channel and denial-of-service attacks, even when using the usual input as the attack vector.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 231-234
Citations: 0
Revisiting Black-Hat HLS: A Lightweight Countermeasure to HLS-Aided Trojan Attack
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-26; DOI: 10.1109/LES.2023.3327793
Mahendra Rathor;Anirban Sengupta
One of the dark sides of the horizontal semiconductor business model could be the supply of compromised computer-aided design (CAD) tools by an adversary to the designers. A compromised or black-hat high-level synthesis (HLS) tool may secretly insert a Trojan into the design being synthesized to affect its functional or nonfunctional aspects. Recently, a black-hat HLS was presented which inserts fake operations during the scheduling process to enable a battery exhaustion attack. In this letter, we present a framework to detect the fake operations inserted by a compromised HLS with the help of scheduling information provided by the tool. We implemented our detection framework on a number of benchmarks and analyzed the detection time and accuracy. We also analyzed the cost of fake operation insertion in terms of design area and delay overhead.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 170-173
Citations: 0
Design of Tri-Band Bandpass Filter Using Modified X-Shaped Structure for IoT-Based Wireless Applications
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-19; DOI: 10.1109/LES.2023.3325898
Bilal Mushtaq;Muhammad Abdul Rehman;Sohail Khalid;Majed Alhaisoni
This letter introduces a straightforward method for designing a tri-band bandpass filter using a loaded symmetrical dual-step impedance resonator (D-SIR) structure, augmented by multiple open and short circuit stubs. The resulting filter operates at 2.2, 7.9, and 14.1 GHz, showcasing high-band selectivity. Experimental measurements confirm six transmission zeros and six transmission poles across the passbands, yielding fractional bandwidths (FBW) of 42.6%, 63.8%, and 27.9%. Particularly noteworthy is the broader upper stopband rejection, enhancing the filter’s overall performance. The design achieves impressive performance metrics, with minimal insertion loss (IL) of 0.71, 0.87, and 0.76 dB, accompanied by substantial return loss (RL) of 23, 18, and 13.8 dB across the three passbands. Furthermore, the fabricated design has a compact size of $(0.20 \times 0.15)\lambda_{g}$ on a Rogers Duroid RT/5880 substrate. Remarkable agreement between measured and simulated results underscores its viability for Internet of Things-based communication systems.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 194-197
Citations: 0
Design of a Stub-Loaded Coupled Line Diplexer for IoT-Based Applications
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-19; DOI: 10.1109/LES.2023.3325578
Muhammad Idrees;Sohail Khalid;Muhammad Abdulrehman;Bilal Mushtaq;Ali Imran Najam;Majed Alhaisoni
This letter presents the design of a microstrip-based diplexer for the applications of the Internet of Things (IoT). By incorporating stub-loaded coupled line resonators, the diplexer achieves significant enhancements in its passband performance. Specifically engineered to operate at precise frequencies of 2.55 and 3.94 GHz, the diplexer demonstrates improved isolation and selectivity through the integration of five transmission poles (TPs). A comprehensive analysis is conducted, evaluating crucial parameters, such as size, insertion loss, return loss, and isolation. The diplexer is fabricated on a compact Rogers Duroid 5880 substrate, and experimental measurements validate its effectiveness, exhibiting a low insertion loss of 0.3 dB at 2.55 GHz and 0.4 dB at 3.94 GHz, in close agreement with simulated predictions. The proposed design, featuring stub-loaded coupled line resonators, showcases highly promising passband characteristics, making it a compelling solution for efficient multiplexing of diverse frequency bands in wireless communication applications within the IoT and network-on-chip (NoC) domains.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 186-189
Citations: 0
Common Subexpression-Based Compression and Multiplication of Sparse Constant Matrices
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-13; DOI: 10.1109/LES.2023.3323635
Emre Bilgili;Arda Yurdakul
In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied on sparse constant matrices to deploy the models on low-cost embedded devices. However, the state-of-the-art CSE elimination methods do not scale well for handling large matrices. They reach hours for extracting CSEs in a $200 \times 200$ matrix while their matrix multiplication algorithms execute longer than the conventional matrix multiplication methods. Besides, there exist no compression methods for matrices utilizing CSEs. As a remedy to this problem, a random search-based algorithm is proposed in this letter to extract CSEs in the column pairs of a constant matrix. It produces an adder tree for a $1000 \times 1000$ matrix in a minute. To compress the adder tree, this letter presents a compression format by extending the compressed sparse row (CSR) to include CSEs. While compression rates of more than 50% can be achieved compared to the original CSR format, simulations for a single-core embedded system show that the matrix multiplication execution time can be reduced by 20%.
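To make the idea of common subexpressions in column pairs concrete, here is a minimal sketch. It is not the letter's random search-based algorithm or its extended CSR format: it assumes a 0/1 constant matrix and uses a simple greedy pass instead, and the names `most_common_pair`, `extract_cse`, and `multiply` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def most_common_pair(rows):
    """Return the column pair that co-occurs in the most rows, with its count."""
    counts = Counter()
    for cols in rows:
        counts.update(combinations(sorted(cols), 2))
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def extract_cse(rows, n_cols):
    """Greedily replace repeated column pairs with shared adder nodes.

    rows: list of sets; rows[i] holds the column indices of the nonzeros
          (all assumed to be 1) in row i of the constant matrix.
    Returns (new_rows, adders), where adders[k] = (a, b) means that the
    virtual column n_cols + k carries the value x[a] + x[b].
    """
    rows = [set(r) for r in rows]
    adders = []
    while True:
        pair, count = most_common_pair(rows)
        if pair is None or count < 2:  # a pair used only once saves nothing
            break
        new_col = n_cols + len(adders)
        adders.append(pair)
        a, b = pair
        for r in rows:
            if a in r and b in r:
                r -= {a, b}      # drop the two original columns in place
                r.add(new_col)   # and reference the shared sum instead
    return rows, adders


def multiply(rows, adders, x):
    """Compute y = A @ x, evaluating each shared sum only once."""
    values = list(x)
    for a, b in adders:
        values.append(values[a] + values[b])
    return [sum(values[c] for c in r) for r in rows]
```

In this toy setting, every row that contains a shared pair reads one precomputed sum instead of adding the two inputs again, which is the source of the savings in both storage and additions.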
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 82-85
Citations: 0
Exploring Dynamic Duty Cycling for Energy Efficiency in Coherent DSP ASIC
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-05; DOI: 10.1109/LES.2023.3322301
Lucas Castro;Jonathas Silveira;Rodrigo Zeli;Victor Araújo;Marcelo Guedes;Daniel Lazari;Rodolfo Azevedo;Lucas Wanner
In coherent optics transmission systems, the digital signal processor (DSP) application-specific integrated circuit (ASIC) is the most power-hungry part of the optical transceiver. With designs already at the edge of transistor technology, we must look for opportunities to further optimize them in order to meet the power budget. This letter explores a dynamic duty cycle for reducing the consumption in the pipeline of such DSP ASICs. We exploit the characteristics of estimator algorithms to introduce a dynamic duty cycle, reducing the mean consumption of designs originally constrained only for worst-case scenarios. We present the methodology to implement duty cycle control using the carrier frequency offset estimator (CFE) algorithm as a case study, achieving, at the simulation level, a 22% to 74% reduction in this algorithm's power consumption, depending on the on-chip operating conditions.
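The effect of duty cycling on mean consumption can be illustrated with a small sketch. It assumes a simple two-state model in which a block is either active or idle; this model, the function names, and any power figures passed in are hypothetical and not taken from the letter.

```python
def mean_power(duty_cycle, p_active, p_idle):
    """Average power of a block that is active for a fraction `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * p_active + (1.0 - duty_cycle) * p_idle


def power_reduction(duty_cycle, p_active, p_idle):
    """Fractional saving relative to an always-active (worst-case) design."""
    return 1.0 - mean_power(duty_cycle, p_active, p_idle) / p_active
```

Under this model the saving grows as the duty cycle falls and as idle power shrinks relative to active power, which is why a controller that lowers the duty cycle under favorable conditions reduces mean consumption.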
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 202-205
Citations: 0
EvoLP: Self-Evolving Latency Predictor for Model Compression in Real-Time Edge Systems
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-02; DOI: 10.1109/LES.2023.3321599
Shuo Huai;Hao Kong;Shiqing Li;Xiangzhong Luo;Ravi Subramaniam;Christian Makaya;Qian Lin;Weichen Liu
Edge devices are increasingly utilized for deploying deep learning applications on embedded systems. The real-time nature of many applications and the limited resources of edge devices necessitate latency-targeted neural network compression. However, measuring latency on real devices is challenging and expensive. Therefore, this letter presents a novel and efficient framework, named EvoLP, to accurately predict the inference latency of models on edge devices. This predictor can evolve to achieve higher latency prediction precision during the network compression process. Experimental results on three edge devices and four model variants demonstrate that EvoLP outperforms previous state-of-the-art approaches. Moreover, when incorporated into a model compression framework, it effectively guides the compression process for higher model accuracy while satisfying strict latency constraints. We open-source EvoLP at https://github.com/ntuliuteam/EvoLP.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 174-177
Citations: 0
An SoC Design for Future Mobile DNA Detection
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-10-02; DOI: 10.1109/LES.2023.3321587
Yunus Dawji;Zhongpan Wu;Abel Beyene;Karim Hammad;Ebrahim Ghafar-Zadeh;Sebastian Magierowski
Existing miniature DNA sequencing devices hold significant promise to serve as mobile/personal genomic analysis systems in the future. But a key challenge to this vision is the absence of adequate low-power bioinformatic computing ability within the sequencing device itself. In this letter, we discuss the design and demonstrate a system-on-chip (SoC) based on an accelerated RISC-V core for such a task. The chip was fabricated in 22-nm CMOS and executes almost $10\times$ faster than a commercial mobile processor on a DNA sequence detection task while achieving $200\times$ better energy efficiency.
IEEE Embedded Systems Letters, vol. 16, no. 2, pp. 86-89
Citations: 0
Design and Implementation of an NoC-Based Convolution Architecture With GEMM and Systolic Arrays
IF 1.6; Zone 4 (Computer Science); Q3 (Computer Science, Hardware & Architecture); Pub Date: 2023-09-29; DOI: 10.1109/LES.2023.3321019
S. Ortega-Cisneros
Neural networks have long been used for image detection and recognition applications due to their ability and efficiency in complex problem solving. Several researchers have chosen to design and develop hardware accelerators for the convolution layer because of its large computational cost. For that reason, a system that performs indirect GEMM convolution is implemented in an FPGA in this letter. The input data is segmented and distributed into acceleration modules in a parallel and distributed manner using the Network-on-Chip (NoC) paradigm, and a systolic array (SA) is implemented for the matrix multiplication operation as the processing block within each NoC node. Synthesis and performance results show that the implementation of this system presents better results compared to the state of the art in areas such as acceleration factor, consumption of resources, latency, and operational frequency.
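For readers unfamiliar with GEMM-based convolution, the sketch below shows the basic im2col lowering in plain Python: image patches are unrolled into the rows of a matrix so that convolution becomes a single matrix multiplication. This is the conventional direct lowering, not the indirect GEMM variant or the NoC and systolic-array hardware described in the letter. It assumes a single channel, stride 1, and no padding, and all function names are chosen for illustration.

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2-D image into one row."""
    h, w = len(image), len(image[0])
    patches = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patches.append([image[i + di][j + dj]
                            for di in range(k) for dj in range(k)])
    return patches


def matmul(a, b):
    """Naive matrix product of a (m x n) and b (n x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_gemm(image, kernel):
    """Valid cross-correlation (CNN-style convolution) via im2col + GEMM."""
    k = len(kernel)
    out_w = len(image[0]) - k + 1
    patches = im2col(image, k)                      # (out_h * out_w) x (k * k)
    weights = [[v] for row in kernel for v in row]  # (k * k) x 1
    flat = [r[0] for r in matmul(patches, weights)]
    return [flat[i:i + out_w] for i in range(0, len(flat), out_w)]
```

The `matmul` step is the part a hardware accelerator would typically map onto a matrix-multiplication engine; here it is a plain nested loop for clarity.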
IEEE Embedded Systems Letters, vol. 16, no. 1, pp. 49-52
Citations: 0