
2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD): Latest Papers

ATLAS: A Two-Level Layer-Aware Scheme for Routing with Cell Movement
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549470
Xinshi Zang, Fangzhou Wang, Jinwei Liu, Martin D. F. Wong
Placement and routing are two crucial steps in the physical design of integrated circuits (ICs). To close the gap between placement and routing, the routing with cell movement problem has attracted great attention recently. In this problem, a certain number of cells can be moved to new positions and the nets can be rerouted to reduce the total wire length. In this work, we advance the study of this problem by proposing a two-level layer-aware scheme, named ATLAS. A coarse-level cluster-based cell movement is first performed to optimize via usage and provide a better starting point for the subsequent fine-level single-cell movement. To further encourage routing on the upper metal layers, we utilize a set of adjusted layer weights to increase the routing cost on lower layers. Experimental results on the ICCAD 2020 contest benchmarks show that ATLAS achieves much more wire length reduction than the state-of-the-art routing with cell movement engine. Furthermore, applied to the ICCAD 2021 contest benchmarks, ATLAS outperforms the first-place team of the contest with much better solution quality while being 3× faster.
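The layer-weight idea in the abstract can be illustrated with a small cost function. This is only a sketch under stated assumptions, not the ATLAS implementation: the `Segment` type, the `weighted_cost` function, the weight values, and the unit via cost are all hypothetical and chosen purely to show how a higher weight on lower layers can make an upper-layer route cheaper even after paying for extra vias.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One straight wire segment of a routed net (hypothetical model)."""
    layer: int   # metal layer index, 1 = lowest layer
    length: int  # length in routing-grid units


def weighted_cost(segments, num_vias, layer_weight, via_cost=1.0):
    """Weighted routing cost: per-layer weighted wire length plus via cost.

    layer_weight maps a layer index to a multiplier. Larger multipliers
    on lower layers penalize routing there (assumed weighting scheme).
    """
    wire = sum(layer_weight[s.layer] * s.length for s in segments)
    return wire + via_cost * num_vias


# Assumed weights: lower layers cost more per grid unit.
layer_weight = {1: 1.5, 2: 1.2, 3: 1.0, 4: 1.0}

# The same 10-unit connection routed on layer 1 (no vias) versus
# lifted to layer 3 (four extra vias to go up and come back down).
cost_low = weighted_cost([Segment(1, 10)], 0, layer_weight)
cost_high = weighted_cost([Segment(3, 10)], 4, layer_weight)
print(cost_low, cost_high)
```

With these assumed numbers the upper-layer route wins despite its vias. With uniform weights the layer-1 route would be cheaper, which is the bias that adjusted weights are meant to counteract.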
Citations: 0
A Stochastic Approach to Handle Non-Determinism in Deep Learning-Based Design Rule Violation Predictions
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549347
Rongjian Liang, Hua Xiang, Jinwook Jung, Jiang Hu, Gi-Joon Nam
Deep learning is a promising approach to early DRV (Design Rule Violation) prediction. However, non-deterministic parallel routing hampers model training and degrades prediction accuracy. In this work, we propose a stochastic approach, called LGC-Net, to solve this problem. In this approach, we develop two new techniques, a Gaussian random field layer and a focal likelihood loss function, to seamlessly integrate the Log Gaussian Cox process with deep learning. This approach provides not only statistical regression results but also classification results at different thresholds without retraining. Experimental results with noisy training data on industrial designs demonstrate that LGC-Net achieves significantly better accuracy in DRV density prediction than prior art.
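For readers unfamiliar with the Log Gaussian Cox process mentioned above, its standard textbook definition is sketched below. This is the generic model, not LGC-Net's specific formulation. The symbols $Z$, $\mu$, $k$, $\lambda$, and $N(A)$ are notation assumed here and do not come from the abstract.

```latex
\begin{aligned}
Z(s) &\sim \mathcal{GP}\big(\mu(s),\, k(s, s')\big), \\
\lambda(s) &= \exp\big(Z(s)\big), \\
N(A) \mid \lambda &\sim \mathrm{Poisson}\!\left(\int_A \lambda(s)\, ds\right).
\end{aligned}
```

Here $N(A)$ is the number of events (for example, DRVs) in a region $A$, and its conditional mean is $\int_A \lambda(s)\, ds$. Because the model yields an intensity rather than a hard label, a classification at any threshold can in principle be read off the same fitted intensity, which is consistent with the abstract's remark about thresholds without retraining.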
Citations: 3
Attack Directories on ARM big.LITTLE Processors
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549340
Zili Kou, Sharad Sinha, Wenjian He, W. Zhang
Eviction-based cache side-channel attacks take advantage of inclusive cache hierarchies and shared cache hardware. Processors with the template ARM big.LITTLE architecture do not guarantee such preconditions and therefore will not usually allow cross-core attacks let alone cross-cluster attacks. This work reveals a new side-channel based on the snoop filter (SF), an unexplored directory structure embedded in template ARM big.LITTLE processors. Our systematic reverse engineering unveils the undocumented structure and property of the SF, and we successfully utilize it to bootstrap cross-core and cross-cluster cache eviction. We demonstrate a comprehensive methodology to exploit the SF side-channel, including the construction of eviction sets, the covert channel, and attacks against RSA and AES. When attacking TrustZone, we conduct an interrupt-based side-channel attack to extract the key of RSA by a single profiling trace, despite the strict cache clean defense. Supported by detailed experiments, the SF side-channel not only achieves competitive performance but also overcomes the main challenge of cache side-channel attacks on ARM big.LITTLE processors.
Citations: 0
Evaluating the Security of eFPGA-based Redaction Algorithms
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549425
Amin Rezaei, Raheel Afsharmazayejani, Jordan Maynard
Hardware IP owners must envision procedures to avoid piracy and overproduction of their designs under a fabless paradigm. A newly proposed technique to obfuscate critical components in a logic design is called eFPGA-based redaction, which replaces a sensitive sub-circuit with an embedded FPGA (eFPGA) that is configured to perform the same functionality as the missing sub-circuit. In this case, the configuration bitstream acts as a hidden key known only to the hardware IP owner. In this paper, we first evaluate the security promise of the existing eFPGA-based redaction algorithms as a preliminary study. Then, we break eFPGA-based redaction schemes with an initial but not necessarily efficient attack named DIP Exclusion, which excludes problematic input patterns from checking in a brute-force manner. Finally, by combining cycle breaking and unrolling, we propose a novel and powerful attack called Break & Unroll that is able to recover the bitstream of state-of-the-art eFPGA-based redaction schemes in a relatively short time, even in the presence of hard cycles and large keys. This study reveals that the common perception that eFPGA-based redaction is secure by default against oracle-guided attacks is unfounded. It also shows that additional research on how to systematically create an exponential number of non-combinational hard cycles is required to secure eFPGA-based redaction schemes.
Citations: 5
Stochastic Mixed-Signal Circuit Design for In-sensor Privacy (Invited Paper)
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3561099
N. Cao, Jianbo Liu, Boyang Cheng, Muya Chang
The ubiquitous data acquisition and extensive data exchange of sensors pose severe security and privacy concerns for end-users and the public. To enable real-time protection of raw data, privacy-preserving algorithms need to run at the point of data generation, i.e., in-sensor privacy. However, due to severe sensor resource constraints and intensive computation/security costs, how to enable data protection algorithms with efficient circuit techniques remains an open question. To answer this question, this paper discusses the potential of a stochastic mixed-signal (SMS) circuit for ultra-low-power, small-footprint data security. In particular, this paper discusses digitally controlled oscillators (DCOs) and their advantages over conventional digital circuit baselines in (1) seamless analog interface, (2) stochastic computation efficiency, and (3) unified entropy generation. With the DCO as an illustrative case, we (1) target SMS privacy-preserving architecture definition and a systematic analysis of its performance gains across various hardware/software configurations, and (2) revisit analog/mixed-signal voltage/transistor scaling in the context of entropy-based data protection.
Citations: 0
Usage-Based RTL Subsetting for Hardware Accelerators
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549391
Qinhan Tan, Aarti Gupta, S. Malik
Recent years have witnessed increasing use of domain-specific accelerators in computing platforms to provide power-performance efficiency for emerging applications. To increase their applicability within the domain, these accelerators tend to support a large set of functions, e.g. Nvidia’s open-source Deep Learning Accelerator, NVDLA, supports five distinct groups of functions [17]. However, an individual use case of an accelerator may utilize only a subset of these functions. The unused functions lead to unnecessary overhead of silicon area, power, and hardware verification/hardware-software co-verification complexity. This motivates our research question: Given an RTL design for an accelerator and a subset of functions of interest, can we automatically extract a subset of the RTL that is sufficient for these functions and sequentially equivalent to the original RTL? We call this the Usage-based RTL Subsetting problem, referred to as the RTL subsetting problem in short. We first formally define this problem and show that it can be formulated as a program synthesis problem, which can be solved by performing expensive hyperproperty checks. To overcome the high cost, we propose multiple levels of sound over-approximations to construct an effective algorithm based on relatively less expensive temporal property checking and taint analysis for information flow checking. We demonstrate the acceptable computation cost and the quality of the results of our algorithm through several case studies of accelerators from different domains. The applicability of our proposed algorithm can be seen in its ability to subset the large NVDLA accelerator (with over 50,000 registers and 1,600,000 gates) for the group of convolution functions, where the subset reduces the total number of registers by 18.6% and the total number of gates by 37.1%.
Citations: 0
Hardware IP Protection against Confidentiality Attacks and Evolving Role of CAD Tool (Invited Paper)
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3561103
S. Bhunia, Amitabh Das, Saverio Fazzari, V. Kammler, David Kehlet, J. Rajendran, Ankur Srivastava
With growing use of hardware intellectual property (IP) based integrated circuits (IC) design and increasing reliance on a globalized supply chain, the threats to confidentiality of hardware IPs have emerged as major security concerns to the IP producers and owners. These threats are diverse, including reverse engineering (RE), piracy, cloning, and extraction of design secrets, and span different phases of electronics life cycle. The academic research community and the semiconductor industry have made significant efforts over the past decade on developing effective methodologies and CAD tools targeted to protect hardware IPs against these threats. These solutions include watermarking, logic locking, obfuscation, camouflaging, split manufacturing, and hardware redaction. This paper focuses on key topics on confidentiality of hardware IPs encompassing the major threats, protection approaches, security analysis, and metrics. It discusses the strengths and limitations of the major solutions in protecting hardware IPs against the confidentiality attacks, and future directions to address the limitations in the modern supply chain ecosystem.
Citations: 0
Dynamic Frequency Boosting beyond Critical Path Delay
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549433
N. Zompakis, S. Xydis
This paper introduces an innovative post-implementation Dynamic Frequency Boosting (DFB) technique to release "hidden" performance margins of digital circuit designs currently suppressed by typical critical-path-constrained design flows, thus defining higher limits of operation speed. The proposed technique goes beyond the state of the art and exploits data-driven path delay variability, incorporating an innovative hardware clocking mechanism that detects the paths' activation in real time. In contrast to timing speculation, the operating speed is adjusted according to the nominal delay of the activated paths, achieving error-free acceleration. The proposed technique has been evaluated on three FPGA-based use cases carefully selected to exhibit differing domain characteristics, i.e., (i) a third-party DNN inference accelerator IP for CIFAR-10 images, achieving an average speedup of 18%; (ii) a highly designer-optimized Optical Digital Equalizer design, in which DFB delivered a speedup of 50%; and (iii) a set of 5 synthetic designs examining high-frequency (beyond 400 MHz) applications in FPGAs, achieving accelerations of 20-60% depending on the underlying path variability.
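The intuition behind clocking by activated-path delay rather than by the worst-case critical path can be shown with a small calculation. This is an idealized sketch, not the paper's clocking mechanism. It assumes the clock period can be set per cycle to the longest activated path with zero switching overhead, and the function name `boosted_speedup` and all delay values are hypothetical.

```python
def boosted_speedup(critical_delay_ns, activated_delays_ns):
    """Ideal speedup of per-cycle clocking over a fixed critical-path clock.

    critical_delay_ns   -- static worst-case path delay (fixed clock period)
    activated_delays_ns -- for each cycle, the longest delay among the paths
                           actually activated by that cycle's data
    Assumes zero overhead for changing the clock period (idealization).
    """
    if not activated_delays_ns:
        raise ValueError("need at least one cycle")
    if any(d > critical_delay_ns for d in activated_delays_ns):
        raise ValueError("an activated path cannot exceed the critical path")
    static_time = critical_delay_ns * len(activated_delays_ns)
    dynamic_time = sum(activated_delays_ns)
    return static_time / dynamic_time


# Four cycles; the critical path (10 ns) is exercised only in the first.
speedup = boosted_speedup(10.0, [10.0, 8.0, 6.0, 8.0])
print(speedup)
```

In this model the gain depends entirely on how often the data activates short paths, which matches the abstract's observation that the achieved acceleration varies with the underlying path variability.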
Citations: 0
GPU-Accelerated Rectilinear Steiner Tree Generation
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549434
Zizheng Guo, Feng Gu, Yibo Lin
Rectilinear Steiner minimum tree (RSMT) generation is a fundamental component in the VLSI design automation flow. Due to its extensive usage in circuit design iterations at early design stages like synthesis, placement, and routing, the performance of RSMT generation is critical for a reasonable design turnaround time. State-of-the-art RSMT generation algorithms, like fast look-up table estimation (FLUTE), are constrained by CPU-based parallelism with limited runtime improvements. The acceleration of RSMT on GPUs is an important yet difficult task, due to the complex and non-trivial divide-and-conquer computation patterns with recursions. In this paper, we present the first GPU-accelerated RSMT generation algorithm based on FLUTE. By designing GPU-efficient data structures and levelized decomposition, table look-up, and merging operations, we incorporate large-scale data parallelism into the generation of Steiner trees. An up to 10.47× runtime speed-up has been achieved compared with FLUTE running on 40 CPU cores, filling in a critical missing component in today’s GPU-accelerated design automation framework.
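As background on what an RSMT is, the smallest non-trivial case can be worked by hand. The sketch below is neither FLUTE's table look-up nor the GPU algorithm from the paper. It only demonstrates the well-known fact that for a 3-pin net the optimal Steiner point lies at the median x and median y of the pins, so the tree length equals the half-perimeter wirelength (HPWL) of the bounding box. All function names are assumptions made for illustration.

```python
def hpwl(pins):
    """Half-perimeter wirelength of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def steiner_point_3(pins):
    """Optimal Steiner point of a 3-pin net: (median x, median y)."""
    if len(pins) != 3:
        raise ValueError("this closed form only holds for exactly 3 pins")
    xs = sorted(x for x, _ in pins)
    ys = sorted(y for _, y in pins)
    return xs[1], ys[1]


def star_length(pins, center):
    """Total rectilinear length of wires joining every pin to center."""
    cx, cy = center
    return sum(abs(x - cx) + abs(y - cy) for x, y in pins)


pins = [(0, 0), (4, 1), (2, 5)]
center = steiner_point_3(pins)
print(center, star_length(pins, center), hpwl(pins))  # (2, 1) 9 9
```

For nets with more pins no such closed form exists in general, which is why practical tools rely on decomposition and precomputed tables.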
Citations: 1
A Novel Blockage-avoiding Macro Placement Approach for 3D ICs based on POCS
Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549352
Jai-Ming Lin, Po-Chen Lu, Heng-Yu Lin, Jia-Ting Tsai
Although the 3D integrated circuit (IC) placement problem has been studied for many years, few publications are devoted to macro legalization. Because macros are large, macro placement is harder than cell placement, especially when preplaced macros exist in a multi-tier structure. To obtain a more global view, this paper proposes a partitioning-last, macro-first flow for 3D placement of mixed-size designs, which performs tier partitioning after placement prototyping and then legalizes macros before cell placement. A novel two-step approach handles 3D macro placement. The first step determines the locations of macros in a projection plane based on a new representation named K-tier Partially Occupied Corner Stitching. It not only preserves the prototyping result but also guarantees a legal placement after the tier assignment of macros. Next, macros are assigned to their respective tiers by an integer linear programming (ILP) algorithm. Experimental results show that the proposed flow obtains better solutions than other flows, especially in cases with more preplaced macros.
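The legality condition that the tier assignment has to satisfy can be sketched as a simple check: two macros may overlap in the projection plane as long as they sit on different tiers. The code below is an illustration under that assumption only; it does not implement the corner-stitching representation or the ILP formulation from the paper, and the `Macro` type and the `overlaps` and `illegal_pairs` helpers are hypothetical names.

```python
from itertools import combinations
from typing import NamedTuple


class Macro(NamedTuple):
    name: str
    x: float     # lower-left corner in the projection plane
    y: float
    w: float     # width
    h: float     # height
    tier: int    # tier (die) the macro is assigned to


def overlaps(a: Macro, b: Macro) -> bool:
    """True if two macros on the same tier share interior area.

    Macros on different tiers never conflict, and macros that merely
    abut along an edge are treated as legal.
    """
    if a.tier != b.tier:
        return False
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def illegal_pairs(macros):
    """Return the names of every overlapping same-tier macro pair."""
    return [(a.name, b.name)
            for a, b in combinations(macros, 2)
            if overlaps(a, b)]


if __name__ == "__main__":
    macros = [
        Macro("A", 0, 0, 4, 4, tier=0),
        Macro("B", 2, 2, 4, 4, tier=1),  # overlaps A in projection, other tier
        Macro("C", 3, 0, 2, 2, tier=0),  # overlaps A on the same tier
    ]
    print(illegal_pairs(macros))
    # Reassigning C to tier 1 resolves the conflict: it only abuts B there.
    macros[2] = macros[2]._replace(tier=1)
    print(illegal_pairs(macros))
```

A preplaced macro would simply be an entry whose position and tier are fixed, so any movable macro that conflicts with it must change location or tier.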
Citations: 0