McPAT-Calib: A Microarchitecture Power Modeling Framework for Modern CPUs
Jianwang Zhai, Chen Bai, Binwu Zhu, Yici Cai, Qiang Zhou, Bei Yu
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643508

Energy efficiency has become a core concern for modern CPUs, and existing power models struggle to balance speed, generality, and accuracy. This paper introduces McPAT-Calib, a microarchitecture power modeling framework that combines McPAT with machine learning (ML) calibration methods. McPAT-Calib can quickly and accurately estimate the power of different benchmarks running on different CPU configurations, providing an effective evaluation tool for modern CPU design. First, McPAT-7nm is introduced to support analytical power modeling for the 7nm technology node. Then, a wide range of modeling features is identified, and automatic feature selection and advanced regression methods are used to calibrate the McPAT-7nm modeling results, which greatly improves generality and accuracy. Moreover, a sampling algorithm based on active learning (AL) is leveraged to effectively reduce the labeling cost. We use up to 15 configurations of the 7nm RISC-V Berkeley Out-of-Order Machine (BOOM) along with 80 benchmarks to extensively evaluate the proposed framework. Compared with state-of-the-art microarchitecture power models, McPAT-Calib reduces the mean absolute percentage error (MAPE) of shuffle-split cross-validation by 5.95%. More importantly, the MAPE is reduced by 6.14% and 3.64% for evaluations of unknown CPU configurations and benchmarks, respectively. The AL sampling algorithm can reduce the number of labeled samples required by 50%, with an accuracy loss of only 0.44%.

GPU-accelerated Critical Path Generation with Path Constraints
Guannan Guo, Tsung-Wei Huang, Yibo Lin, Martin D. F. Wong
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643504

Path-based Analysis (PBA) is a pivotal step in Static Timing Analysis (STA) for reducing slack pessimism and improving quality of results. Optimization flows often invoke PBA repeatedly with different critical path constraints to verify correct timing behavior under certain logic cones. However, PBA is extremely time-consuming, and state-of-the-art PBA algorithms hardly scale beyond a few CPU threads under a constrained search space. To reach a new performance milestone, in this work we propose a new GPU-accelerated PBA algorithm that can handle extensive path constraints and quickly report an arbitrary number of critical paths within a constrained search space. Experimental results show that our algorithm generates identical path reports and achieves up to 102x speedup on a million-gate design compared to the state-of-the-art algorithm.

Lower Voltage for Higher Security: Using Voltage Overscaling to Secure Deep Neural Networks
Shohidul Islam, Ihsen Alouani, Khaled N. Khasawneh
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643551

Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks: carefully crafted additive noise that undermines DNN integrity. Previously proposed defenses against these attacks incur substantial overheads, making them challenging to deploy on power- and computational-resource-constrained devices, such as embedded systems and the edge. In this paper, we explore the use of voltage over-scaling (VOS) as a lightweight defense against adversarial attacks. Specifically, we exploit the stochastic timing violations of VOS to implement a moving-target defense for DNNs. Our experimental results demonstrate that VOS guarantees effective defense against different attack methods, does not require any software/hardware modifications, and offers a by-product reduction in power consumption.

From Specification to Topology: Automatic Power Converter Design via Reinforcement Learning
Shaoze Fan, N. Cao, Shun Zhang, Jing Li, Xiaoxiao Guo, Xin Zhang
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643552

The tidal wave of modern electronic/electrical devices has led to increasing demand for ubiquitous application-specific power converters. The conventional manual design procedure for such power converters is computation- and labor-intensive: it involves selecting and connecting component devices, tuning component-wise parameters and control schemes, and iteratively evaluating and optimizing the design. To automate and speed up this design process, we propose an automatic framework that designs custom power converters from design specifications using reinforcement learning. Specifically, the framework embraces upper-confidence-bound-tree-based (UCT-based) reinforcement learning to automate topology space exploration with reward signals encoded from the circuit design specification. Moreover, our UCT-based approach can exploit small offline datasets via a specially designed default policy to accelerate topology space exploration. Further, it utilizes a hybrid circuit evaluation strategy to substantially reduce design evaluation costs. Empirically, we demonstrate that our framework can generate energy-efficient circuit topologies for various target voltage conversion ratios. Compared to existing automatic topology optimization strategies, the proposed method is much more computationally efficient: it can generate topologies of the same quality while being up to 67% faster. Additionally, we discuss some interesting circuits discovered by our framework.

An Efficient Two-phase Method for Prime Compilation of Non-clausal Boolean Formulae
Weilin Luo, Hai Wan, Hongzhen Zhong, Ou Wei, Biqing Fang, Xiaotong Song
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643520

Prime compilation aims to generate all prime implicates/implicants of a Boolean formula. Recently, prime compilation of non-clausal formulae has received great attention. Since the problem is hard for $\Sigma_{2}^{P}$, existing methods have performance issues. We argue that the main performance bottleneck stems from enlarging the search space using dual rail (DR) encoding and computing a minimal clausal formula as a by-product. To address this issue, we propose a two-phase approach, named CoAPI, for prime compilation of non-clausal formulae. Thanks to the two-phase framework, we construct a clausal formula without using DR encoding. In addition, the key to improving performance in our work is a novel bounded prime extraction (BPE) method that, by interleaving the extraction of prime implicates with the extraction of small implicates, enables constructing a succinct clausal formula rather than a minimal one. Following the assessment methodology of the state-of-the-art (SOTA) work, we show that CoAPI achieves SOTA performance. In particular, for generating all prime implicates, CoAPI is up to about one order of magnitude faster. Moreover, we evaluate CoAPI on a benchmark drawn from real-world industrial designs. The results also confirm that CoAPI outperforms prior work. Our code and benchmarks are publicly available at https://github.com/LuoWeiLinWillam/CoAPI.

DevelSet: Deep Neural Level Set for Instant Mask Optimization
Guojin Chen, Ziyang Yu, Hongduo Liu, Yuzhe Ma, Bei Yu
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643464

With feature sizes continuously shrinking in advanced technology nodes, mask optimization is increasingly crucial in the conventional design flow, accompanied by explosive growth in the prohibitive computational overhead of optical proximity correction (OPC) methods. Recently, the inverse lithography technique (ILT) has drawn significant attention and is becoming prevalent in emerging OPC solutions. However, existing ILT methods are either time-consuming or deliver weak mask printability and manufacturability. In this paper, we present DevelSet, a GPU- and deep neural network (DNN)-accelerated level-set OPC framework for metal layers. We first improve the conventional level-set-based ILT algorithm by introducing a curvature term to reduce mask complexity and applying GPU acceleration to overcome computational bottlenecks. To further enhance printability and speed up iterative convergence, we propose a novel deep neural network, carefully designed around level-set intrinsic principles, to facilitate the joint optimization of the DNN and the GPU-accelerated level-set optimizer. Experimental results show that the DevelSet framework surpasses state-of-the-art methods in printability and boosts runtime performance to an instant level (around 1 second).

Pub Date : 2021-11-01DOI: 10.1109/ICCAD51958.2021.9643509
Chengyu Zhang, Minquan Sun, Jianwen Li, Ting Su, G. Pu
We introduce Circuit Structure Mutation, a simple but effective mutation-based testing approach, for testing hardware model checkers. The key idea is to mutate the existing And-Inverter Graph (AIG) circuit by manipulating the relations among the components in the graph while preserving the validity of the mutant. Based on Circuit Structure Mutation, we implemented a feedback-guided testing tool named Hammer. In our evaluation, Hammer shows its effectiveness on finding bugs, increasing test coverage, and finding performance optimization chances, which can help the hardware model checker developers improve the reliability and the performance of their tools.
{"title":"Feedback-Guided Circuit Structure Mutation for Testing Hardware Model Checkers","authors":"Chengyu Zhang, Minquan Sun, Jianwen Li, Ting Su, G. Pu","doi":"10.1109/ICCAD51958.2021.9643509","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643509","url":null,"abstract":"We introduce Circuit Structure Mutation, a simple but effective mutation-based testing approach, for testing hardware model checkers. The key idea is to mutate the existing And-Inverter Graph (AIG) circuit by manipulating the relations among the components in the graph while preserving the validity of the mutant. Based on Circuit Structure Mutation, we implemented a feedback-guided testing tool named Hammer. In our evaluation, Hammer shows its effectiveness on finding bugs, increasing test coverage, and finding performance optimization chances, which can help the hardware model checker developers improve the reliability and the performance of their tools.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114773088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
2021 CAD Contest Problem A: Functional ECO with Behavioral Change Guidance (Invited Paper)
Yen-Chun Fang, Shao-Lun Huang, Chi-An Wu, Chung-Han Chou, Chih-Jen Hsu, WoeiTzy Jong, Kei-Yong Khoo
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643492

Functional ECO is an essential step in the VLSI design flow. The technique realizes functional changes with a minimal patch netlist in the gate-level netlist. As design complexity increases, it becomes more and more difficult for functional ECO to generate a minimal patch. The ICCAD 2021 CAD Contest calls for a feasible and efficient ECO algorithm with behavioral change guidance. Beyond ordinary functional ECO problems, the RTL designs are provided, so contestants can utilize the behavioral changes in the RTL designs to minimize the patch for G1.

Overcoming the Memory Hierarchy Inefficiencies in Graph Processing Applications
Jilan Lin, Shuangchen Li, Yufei Ding, Yuan Xie
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643434

Graph processing plays a vital role in mining relational data. However, intensive yet inefficient memory accesses leave graph processing applications severely bottlenecked by the conventional memory hierarchy. In this work, we focus on inefficiencies that exist in both the on-chip cache and the off-chip memory. First, graph processing is known to be dominated by expensive random accesses, which are difficult to capture with conventional cache and prefetcher architectures, leading to low cache hit rates and excessive main memory visits. Second, off-chip bandwidth is further underutilized by the small data granularity: each vertex/edge datum in the graph needs only 4-8 B, much smaller than the 64 B memory access granularity, so a large fraction of the bandwidth is wasted fetching unnecessary data. We therefore present G-MEM, a customized memory hierarchy design for graph processing applications. First, we propose a coherence-free scratchpad as the on-chip memory, which leverages the power-law characteristic of graphs and stores only the hot, frequently accessed data. We equip the scratchpad memory with a degree-aware mapping strategy to better manage it across applications. On the other hand, we design an elastic-granularity DRAM (EG-DRAM) to facilitate main memory access. EG-DRAM is based on a near-data processing architecture, which processes and coalesces multiple fine-grained memory accesses to maximize bandwidth efficiency. Putting these together, G-MEM demonstrates a 2.48× overall speedup over a vanilla CPU, with 1.44× and 1.79× speedups over the state-of-the-art cache architecture and memory subsystem, respectively.

ReIGNN: State Register Identification Using Graph Neural Networks for Circuit Reverse Engineering
Subhajit Dutta Chowdhury, Kaixin Yang, P. Nuzzo
Pub Date: 2021-11-01, DOI: 10.1109/ICCAD51958.2021.9643498

Reverse engineering an integrated circuit netlist is a powerful tool for detecting malicious logic and counteracting design piracy. A critical challenge in this domain is the correct classification of data-path and control-logic registers in a design. We present ReIGNN, a novel learning-based register classification methodology that combines graph neural networks (GNNs) with structural analysis to classify the registers in a circuit with high accuracy and generalize well across different designs. GNNs are particularly effective at processing circuit netlists as graphs, leveraging the properties of nodes and their neighborhoods to learn to efficiently discriminate between different types of nodes. Structural analysis can further rectify registers misclassified as state registers by the GNN by analyzing the strongly connected components of the netlist graph. Numerical results on a set of benchmarks show that ReIGNN achieves, on average, 96.5% balanced accuracy and 97.7% sensitivity across different designs.
