
Proceedings of the Great Lakes Symposium on VLSI 2022: Latest Papers

A Senior-Level Analog IC Design Course built on Open-Source Technologies
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530334
J. Hasler
We present a project-based alternative to a classical senior-level first Analog IC design course. This hands-on approach is enabled by systematically organized online lectures and course material, together with an open-source IC design process (Skywater 130nm CMOS) and open-source tools (magic, Xschem) that are capable of producing working ICs. This realistic design experience builds student confidence in designing 10-100 transistor circuits that could be fabricated on this IC process.
Citations: 0
Radiation Hardening by Design Techniques for the Mutual Exclusion Element
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530310
Moisés Herrera, P. Beerel
Circuits in advanced CMOS technology are increasingly sensitive to transient pulses caused by radiation particles that strike vulnerable circuit components, especially turned-off transistors, often generating multiple voltage upsets. To mitigate these issues, this paper presents a novel Radiation Hardened by Design (RHBD) mutual exclusion element (mutex) that incorporates multiple RHBD techniques with reduced area overhead. We compare our proposed circuit with the baseline and state-of-the-art designs in terms of resiliency to Single Event Transients (SETs) and Single Event Upsets (SEUs), request-to-grant latency, and area overhead. Results show that the proposed circuit mitigates SETs and prevents SEUs while incurring 1.42x performance and 5.1x transistor area overhead compared to the baseline (unhardened) design. On the other hand, the proposed mutex circuit improves SEU resiliency at outputs while requiring 0.58x the transistor area and 0.62x the latency of the state-of-the-art RHBD mutex that uses modular redundancy.
Citations: 0
Effective and Efficient Detailed Routing with Adaptive Rip-up Scheme and Pin Access Refinement
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530361
Zhongdong Qi, Jingchong Zhang, Gengjie Chen, Hailong You
Detailed routing is one of the most complex and time-consuming stages of the VLSI design process. Due to the rapidly growing problem scale and the increasing number of design rules in advanced technology nodes, a feasible routing result can only be achieved after many rounds of rip-up and reroute (R&R) iterations, which take a long time to run. In this paper, we propose several effective and efficient techniques to handle design rule violations in detailed routing. An adaptive rip-up scheme with two strategies of different effort is designed, which speeds up the R&R phase with comparable solution quality. To cope with the pin access challenge under complex design rule constraints, approaches to refine the pin connections are proposed. In addition, some specific design rules are handled efficiently in a post-processing step. Experimental results show that, after integrating these techniques into Dr. CU 2.0, the number of design rule violations is reduced by 69% with 28% lower runtime on average.
Citations: 0
Fault-Injection Based Chosen-Plaintext Attacks on Multicycle AES Implementations
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530826
Yadi Zhong, Ujjwal Guin
Hardware implementations of cryptographic algorithms offer significantly higher throughput on both encryption and decryption than their software counterparts. Advanced Encryption Standard (AES) is a widely used symmetric block cipher for data encryption. The most commonly used architecture for AES hardware implementations is the multicycle design, where each round uses the same hardware resource multiple times to increase area efficiency. In this paper, we successfully decouple the interdependency of multiple key bytes from the AES encryption. Thus, we solve each key byte separately with an overall attack complexity in O(2^8). Moreover, we uniquely determine each key byte through a chosen set of three plaintext-ciphertext pairs. We propose two novel chosen-plaintext attacks on multicycle AES implementations. Both attacks can eliminate the key diffusion from the MixColumns and Key Schedule modules. The first attack takes advantage of vulnerable AES implementations where an adversary can observe the output of each round. The second attack is based on fault injection, where a single fault on the completion-indicator register is sufficient to launch the attack. Because no faults are injected into the internal computations of AES and no intermediate result is altered, current fault detection mechanisms are bypassed. Lastly, we explore the theoretical aspects of the inherent property of our attacks.
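To put the per-byte complexity in perspective, here is a minimal arithmetic sketch. It is not the authors' attack: it only counts candidates, and it assumes AES-128 (a 16-byte key), which the abstract does not state. The constant names are illustrative.

```python
# Assumption: AES-128, i.e. a 16-byte key. The abstract does not give the key size.
KEY_BYTES = 16
CANDIDATES_PER_BYTE = 2 ** 8  # every possible value of one key byte

# If each key byte can be solved on its own, the candidate counts add up.
independent_search = KEY_BYTES * CANDIDATES_PER_BYTE  # 16 * 2^8 = 2^12

# If the bytes had to be guessed jointly, the candidate counts multiply.
joint_search = CANDIDATES_PER_BYTE ** KEY_BYTES  # (2^8)^16 = 2^128

print(f"independent: 2^{independent_search.bit_length() - 1} candidates")
print(f"joint:       2^{joint_search.bit_length() - 1} candidates")
```

The constant factor of 16 is why the overall cost stays in O(2^8) once the key bytes are decoupled.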
Citations: 2
GAN-Dummy Fill: Timing-aware Dummy Fill Method using GAN
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530352
Myong Kong, D. Kim, Minhyuk Kweon, Seokhyeong Kang
The chemical mechanical polishing (CMP) dummy fill method is commonly used for the planarization of the CMP process, resulting in the development of many automated methods. We propose a dummy fill method using a generative adversarial network (GAN) that improves the existing dummy fill methods in terms of the uniformity of metal density and timing of critical nets. The dummy patterns created were similar to those of existing methods. However, the GAN dummy fill method applies additional optimizations to make the CMP dummy fill pattern efficient. The method learns by adding density and parasitic capacitance to the loss function of the GAN. Compared to dummy patterns generated from commercial tools, dummy patterns generated from GAN-dummy fill reduced the negative timing slack due to parasitic capacitance by up to 45%.
Citations: 1
Sextuple Cross-Coupled-DICE Based Double-Node-Upset Recoverable and Low-Delay Flip-Flop for Aerospace Applications
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530355
Aibin Yan, Yu Chen, Shukai Song, Zijie Zhai, Jie Cui, Zhengfeng Huang, P. Girard, X. Wen
This paper proposes a novel sextuple cross-coupled dual-interlocked-storage-cell (DICE) based double-node-upset (DNU) recoverable and low-delay flip-flop (FF), namely SCDRL-FF, for aerospace applications. The SCDRL-FF mainly consists of sextuple cross-coupled DICEs controlled by clock-gating. The use of clock-gating based DICEs significantly reduces the CLK-Q transmission delay of the SCDRL-FF. Through the redundant and interlocked clock-gating based DICEs, the SCDRL-FF can provide complete DNU recoverability. Simulation results demonstrate the DNU recoverability of the SCDRL-FF and a 65% delay reduction on average compared with the state-of-the-art hardened FFs. The low delay overhead makes the proposed SCDRL-FF effectively applicable to high-performance applications and the DNU recoverability makes the proposed SCDRL-FF also suitable for aerospace applications.
Citations: 0
Graph Neural Network based Netlist Operator Detection under Circuit Rewriting
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530330
Guangwei Zhao, Kaveh Shamsi
Recently, graph neural networks (GNNs) have shown promise in detecting operators (multiplication, addition, comparison, etc.) and their boundaries in gate-level digital circuit netlists. Unlike formal approaches such as NPN Boolean matching, GNN-based methods are structural and statistical. This means that making structural changes to the circuit while maintaining its functionality may negatively impact their accuracy. In this paper, we explore this question. We show that the prediction accuracy of GNN-based operator detection does indeed fall following simple circuit rewriting. This means that custom rewrites may be a way to hamper operator detection in applications such as logic obfuscation, where such undetectability is a security goal. We then present ways to improve the accuracy of prediction under such transforms by incorporating functional/semi-canonical information into the training and evaluation of the ML model.
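As a small illustration of a rewrite that changes structure but not function, the sketch below replaces an AND gate with its De Morgan equivalent and checks the two netlists by exhaustive simulation. The netlist encoding, the `evaluate` and `equivalent` helpers, and the choice of rewrite are all assumptions made for illustration; they are not the rewrites or tooling used in the paper.

```python
from itertools import product

# Hypothetical gate library operating on 0/1 integers.
OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
    "NOT": lambda x: 1 - x,
}

def evaluate(netlist, inputs, output):
    """Simulate a netlist given as (net, op, input_nets) in topological order."""
    values = dict(inputs)
    for net, op, args in netlist:
        values[net] = OPS[op](*(values[a] for a in args))
    return values[output]

def equivalent(first, second, input_names, output):
    """Compare two netlists on every input assignment (fine for tiny circuits)."""
    for bits in product((0, 1), repeat=len(input_names)):
        assignment = dict(zip(input_names, bits))
        if evaluate(first, assignment, output) != evaluate(second, assignment, output):
            return False
    return True

# y = (a AND b) XOR c
original = [
    ("n1", "AND", ("a", "b")),
    ("y", "XOR", ("n1", "c")),
]

# Same function, with the AND rewritten as NOT(NOT a OR NOT b).
rewritten = [
    ("na", "NOT", ("a",)),
    ("nb", "NOT", ("b",)),
    ("n0", "OR", ("na", "nb")),
    ("n1", "NOT", ("n0",)),
    ("y", "XOR", ("n1", "c")),
]

same_function = equivalent(original, rewritten, ("a", "b", "c"), "y")
print(same_function, len(original), len(rewritten))
```

The two netlists compute the same output for all eight input combinations, yet their gate graphs differ in size and gate types, which is the kind of change a purely structural model has to cope with.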
Citations: 4
PrGEMM: A Parallel Reduction SpGEMM Accelerator
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530387
Chien-Fu Chen, Mikko H. Lipasti
Due to increasing data sparsity in scientific data sets and pruned neural networks, it becomes more challenging to compute with these kinds of sparse data sets efficiently. Several works discuss efficient sparse matrix-vector multiplication (SpMV). However, because of index irregularity in compactly stored matrices, sparse matrix-matrix multiplication (SpGEMM) still suffers from the trade-off between space and efficiency of computation. In this work, we propose PrGEMM, a multiple reduction scheme which (1) computes SpGEMM under a compact storage format without expanding the operands, and (2) by using index lookahead, computes and compares multiple index-data pairs at the same time without violating index order. We evaluate our work on matrices of different sizes from the SuiteSparse data set. Our work achieves a 3.3x improvement in execution cycles compared to the state-of-the-art SpGEMM scheme.
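For readers unfamiliar with SpGEMM on compact storage, the following is a minimal software sketch of the classic row-wise (Gustavson-style) algorithm on CSR operands. It is not the PrGEMM reduction scheme or its hardware; the CSR tuple layout and the function name `spgemm_csr` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Assumed CSR layout: (row pointers, column indices, values).
CSR = Tuple[List[int], List[int], List[float]]

def spgemm_csr(a: CSR, b: CSR) -> CSR:
    """Multiply two CSR matrices without expanding either one to dense form."""
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b
    c_ptr: List[int] = [0]
    c_idx: List[int] = []
    c_val: List[float] = []
    for row in range(len(a_ptr) - 1):
        acc: Dict[int, float] = {}  # column index -> partial sum for this row
        for p in range(a_ptr[row], a_ptr[row + 1]):
            k, a_ik = a_idx[p], a_val[p]
            # Scale row k of B by A[row, k] and merge it into the accumulator.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_ik * b_val[q]
        for j in sorted(acc):  # keep column indices ordered within the row
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val

# A = [[1, 0], [0, 2]], B = [[0, 3], [4, 0]]
A: CSR = ([0, 1, 2], [0, 1], [1.0, 2.0])
B: CSR = ([0, 1, 2], [1, 0], [3.0, 4.0])
C = spgemm_csr(A, B)  # CSR form of [[0, 3], [8, 0]]
print(C)
```

The irregular inner loop over `b_idx` and the per-row merge of partial sums are where index irregularity shows up: the output positions are only known once the operand indices have been compared.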
Citations: 0
An Energy-efficient and High-precision Approximate MAC with Distributed Arithmetic Circuits
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530383
Ziying Cui, Ke Chen, Bi Wu, C. Yan, Weiqiang Liu
In this paper, an approximate distributed arithmetic (DA) based parallel MAC is proposed. First, by adopting three kinds of approximation methods, the novel structure significantly reduces hardware complexity. Then, the result is compensated according to a probability analysis to enhance the precision. The hardware and error metric evaluation demonstrates that the proposed MAC achieves a 25% power-delay product reduction while maintaining better precision. Finally, a Gaussian Blur application is employed to verify the proposed DA-based MAC, which shows a 6 dB average PSNR improvement compared with recent state-of-the-art work.
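To make the PSNR figure concrete, here is a minimal sketch of the standard PSNR definition and of what a given dB gain means in terms of mean squared error. It assumes 8-bit pixels (peak value 255) and uses flat lists of pixel values; the function names are hypothetical and the sample data is made up for illustration, not taken from the paper.

```python
import math

def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10 ** (gain_db / 10)

exact = [10, 20, 30, 40]    # made-up reference pixels
approx = [11, 19, 31, 39]   # made-up approximate result, every pixel off by one
print(round(psnr(exact, approx), 2))
print(round(mse_ratio_for_gain(6), 2))
```

Because PSNR is logarithmic in the mean squared error, a gain of 6 dB at a fixed peak value corresponds to roughly a fourfold reduction in MSE.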
Citations: 0
A Semi-formal Information Flow Validation for Analyzing Secret Asset Propagation in COTS IC Integrated Systems
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530328
Xingyu Meng, Mahmudul Hasan, K. Basu, Tamzidul Hoque
Integration of off-the-shelf components from commercial sources during system design drastically reduces product cost and development time. It also allows faster adoption of new technologies without the risks associated with research and development. Therefore, commercial off-the-shelf (COTS) components can be found in a wide range of applications, including military and aerospace systems. However, an untrusted vendor could include hidden malicious hardware to compromise the functionality of the system or leak secret information through COTS integrated circuits (ICs). Existing trust-verification solutions are generally inapplicable to COTS hardware due to the absence of golden models for analysis. In this paper, we propose a semi-formal validation technique to protect the secret assets in a system that integrates COTS ICs. Our framework identifies the paths that could propagate secret assets to surrounding COTS ICs in the system by analyzing the IC design. Our experimental results on a significantly large microprocessor core demonstrate that the proposed approach is effective in determining information flow violations within a short time and provides greater coverage and accurate identification.
Citations: 0