2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD): Latest Publications

Tunable Precision Control for Approximate Image Filtering in an In-Memory Architecture with Embedded Neurons

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549385

Ayushi Dube, Ankit Wagle, G. Singh, S. Vrudhula

This paper presents a novel hardware-software co-design consisting of a Processing-in-Memory (PiM) architecture with embedded neural processing elements (NPEs) that are highly reconfigurable. The PiM platform and the proposed approximation strategies are applied to various image filtering applications while giving the user fine-grained dynamic control over energy efficiency, precision, and throughput (EPT). The proposed co-design can vary the Peak Signal-to-Noise Ratio (PSNR, the output quality metric for image filtering applications) from 25 dB to 50 dB (the acceptable PSNR range for such applications) without incurring any extra cost in energy or latency. When switching from the accurate to the approximate mode of computation in the proposed co-design, the maximum improvement in energy efficiency and throughput is 2X. Against a MAC-based PE array with the proposed memory platform, however, the gains in energy efficiency are 3X-6X, and the corresponding improvements in throughput are 2.26X-4.52X.
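For readers unfamiliar with the quality metric quoted above, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE). It is not taken from the paper; the function name `psnr`, the list-of-rows image layout, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math

def psnr(reference, filtered, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized grayscale images.

    Images are lists of rows of pixel intensities; max_value is the largest
    representable intensity (255 for 8-bit pixels, an assumption here).
    """
    if len(reference) != len(filtered) or any(
        len(r) != len(f) for r, f in zip(reference, filtered)
    ):
        raise ValueError("images must have the same shape")
    pixel_count = sum(len(row) for row in reference)
    if pixel_count == 0:
        raise ValueError("images must not be empty")
    squared_error = sum(
        (a - b) ** 2
        for row_ref, row_out in zip(reference, filtered)
        for a, b in zip(row_ref, row_out)
    )
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)

exact = [[10, 20], [30, 40]]
approx = [[11, 19], [31, 39]]  # every pixel is off by one intensity level
print(round(psnr(exact, approx), 2))
```

With every pixel off by one level the mean squared error is 1, so the result is about 48.13 dB, which sits near the top of the 25 dB to 50 dB range mentioned in the abstract.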

Citations: 0

Towards High-Quality CGRA Mapping with Graph Neural Networks and Reinforcement Learning

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549458

Yan Zhuang, Zhihao Zhang, Dajiang Liu

Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising solution for accelerating domain applications thanks to their good combination of energy efficiency and flexibility. Loops, as the computation-intensive parts of applications, are often mapped onto CGRAs, and modulo scheduling is commonly used to improve execution performance. However, the actual performance achieved with modulo scheduling depends heavily on the mapping ability of the Data Dependency Graph (DDG) extracted from a loop. As existing approaches usually separate the routing exploration of multi-cycle dependences from mapping for fast compilation, they can easily suffer from poor mapping quality. In this paper, we integrate routing exploration into the mapping process, giving it more opportunities to find a globally optimized solution. Meanwhile, with a reduced resource graph defined, the search space of the new mapping problem is not greatly increased. To solve the problem efficiently, we introduce graph-neural-network-based reinforcement learning to predict a placement distribution over the resource nodes for all operations in a DDG. Using routing connectivity as the reward signal, we optimize the parameters of the neural network with a policy gradient method to find a valid mapping solution. Without much engineering or heuristic design, our approach achieves 1.57× the mapping quality of the state-of-the-art heuristic.
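The idea of learning a placement distribution from a routing-connectivity reward can be illustrated with a toy REINFORCE step. This is a heavily simplified sketch, not the paper's method: a plain table of logits stands in for the graph neural network, the target is assumed to be a small mesh of processing elements (PEs), time slots and PE sharing conflicts are ignored, and the names `routing_reward` and `reinforce_step` are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def routing_reward(placement, edges, cols):
    """Fraction of DDG edges whose endpoints sit on neighbouring PEs of a mesh."""
    def adjacent(a, b):
        row_a, col_a = divmod(a, cols)
        row_b, col_b = divmod(b, cols)
        return abs(row_a - row_b) + abs(col_a - col_b) == 1
    return sum(adjacent(placement[u], placement[v]) for u, v in edges) / len(edges)

def reinforce_step(logits, edges, cols, lr=0.5, rng=random):
    """Sample one placement and nudge the logits along the policy gradient."""
    probs = [softmax(row) for row in logits]
    placement = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
    reward = routing_reward(placement, edges, cols)
    for op, pe in enumerate(placement):
        for j in range(len(logits[op])):
            # d/d(logit_j) of log softmax(logits)[pe] is one_hot(pe)_j - probs_j
            grad = (1.0 if j == pe else 0.0) - probs[op][j]
            logits[op][j] += lr * reward * grad
    return placement, reward

# Three operations in a chain, placed on a hypothetical 2x2 PE mesh.
edges = [(0, 1), (1, 2)]
logits = [[0.0] * 4 for _ in range(3)]
for _ in range(200):
    placement, reward = reinforce_step(logits, edges, cols=2)
```

Placements that keep dependent operations on adjacent PEs earn a higher reward, so their probability grows over repeated steps. A practical mapper would also need a baseline to reduce variance and a check that no two operations occupy the same resource.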

Citations: 3

Obstacle-Avoiding Multiple Redistribution Layer Routing with Irregular Structures*

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549419

Yen-Ting Chen, Yao-Wen Chang

In advanced packages, redistribution layers (RDLs) are extra metal layers for high interconnections among the chips and the printed circuit board (PCB). To better utilize the routing resources of RDLs, published works adopted flexible vias, which can be placed anywhere. Furthermore, some regions may be blocked for signal integrity protection or for manually prerouted nets (such as power/ground nets or feeding lines of antennas) to achieve higher performance. These blocked regions are treated as obstacles in the routing process. Since the positions of pads, obstacles, and vias can be arbitrary, the structures of RDLs become irregular. The obstacles and irregular structures substantially increase the difficulty of the routing process. This paper proposes a three-stage algorithm. First, the layout is partitioned by a method based on constrained Delaunay triangulation (CDT). Then, we present a global routing graph model and generate routing guides for unified-assignment netlists. Finally, a novel tile routing method is developed to obtain detailed routes. Experimental results demonstrate the robustness and effectiveness of our proposed algorithm.

Citations: 0

Evaluating the Security of eFPGA-based Redaction Algorithms

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549425

Amin Rezaei, Raheel Afsharmazayejani, Jordan Maynard

Hardware IP owners must devise procedures to prevent piracy and overproduction of their designs under a fabless paradigm. A recently proposed technique for obfuscating critical components in a logic design, called eFPGA-based redaction, replaces a sensitive sub-circuit with an embedded FPGA (eFPGA) that is configured to perform the same functionality as the missing sub-circuit. In this case, the configuration bitstream acts as a hidden key known only to the hardware IP owner. In this paper, we first evaluate the security promise of the existing eFPGA-based redaction algorithms as a preliminary study. Then, we break eFPGA-based redaction schemes with an initial but not necessarily efficient attack named DIP Exclusion, which excludes problematic input patterns from checking in a brute-force manner. Finally, by combining cycle breaking and unrolling, we propose a novel and powerful attack called Break & Unroll that can recover the bitstream of state-of-the-art eFPGA-based redaction schemes in a relatively short time, even in the presence of hard cycles and large keys. This study reveals that the common perception that eFPGA-based redaction is secure by default against oracle-guided attacks is a prejudice. It also shows that additional research on how to systematically create an exponential number of non-combinational hard cycles is required to secure eFPGA-based redaction schemes.

Citations: 5

Automatic Test Configuration and Pattern Generation (ATCPG) for Neuromorphic Chips

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549422

I. Chiu, Xin-Ping Chen, Jennifer Shueh-Inn Hu, C. Li

The demand for low-power, high-performance neuromorphic chips is increasing. However, conventional testing is not applicable to neuromorphic chips for three reasons: (1) lack of scan DfT, (2) stochastic characteristics, and (3) configurable functionality. In this paper, we present an automatic test configuration and pattern generation (ATCPG) method for testing a configurable stochastic neuromorphic chip without using scan DfT. We use machine learning to generate test configurations. Then, we apply a modified fast gradient sign method to generate test patterns. Finally, we determine the number of test repetitions from the statistical power of the test. We conduct experiments on one neuromorphic architecture, the spiking neural network, to evaluate the effectiveness of our ATCPG. The experimental results show that our ATCPG achieves 100% fault coverage for the five fault models we use. For testing a 3-layer model at a 0.05 significance level, we produce 5 test configurations and 67 test patterns. The average numbers of test repetitions for neuron faults and synapse faults are 2,124 and 4,557, respectively. In addition, our simulation results show that the overkill matches our significance level perfectly.
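How a significance level and a target power translate into a repetition count can be sketched with the textbook normal approximation for a one-sided test of a proportion. The abstract does not give the paper's actual procedure, so this is only an illustrative stand-in: the function name `repetitions_needed` and the spike probabilities `p_good` and `p_faulty` are hypothetical.

```python
import math
from statistics import NormalDist

def repetitions_needed(p_good, p_faulty, alpha=0.05, power=0.8):
    """Repetitions needed to tell a faulty spike rate from the fault-free one.

    Normal approximation to a one-sided, one-sample test of a proportion:
    H0 says the output spikes with probability p_good, H1 says p_faulty.
    """
    if p_good == p_faulty:
        raise ValueError("the two spike probabilities must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # bounds overkill (good chips rejected)
    z_power = NormalDist().inv_cdf(power)      # bounds test escapes (faults missed)
    spread = (z_alpha * math.sqrt(p_good * (1 - p_good))
              + z_power * math.sqrt(p_faulty * (1 - p_faulty)))
    return math.ceil((spread / abs(p_faulty - p_good)) ** 2)

# A fault that shifts the spike probability only slightly needs many more repetitions.
print(repetitions_needed(0.5, 0.6), repetitions_needed(0.5, 0.52))
```

Under this model the significance level is the probability of rejecting a fault-free chip, which is why an overkill rate equal to the significance level is the expected outcome.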

Citations: 0

A Stochastic Approach to Handle Non-Determinism in Deep Learning-Based Design Rule Violation Predictions

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549347

Rongjian Liang, Hua Xiang, Jinwook Jung, Jiang Hu, Gi-Joon Nam

Deep learning is a promising approach to early DRV (Design Rule Violation) prediction. However, non-deterministic parallel routing hampers model training and degrades prediction accuracy. In this work, we propose a stochastic approach, called LGC-Net, to solve this problem. In this approach, we develop two new techniques, a Gaussian random field layer and a focal likelihood loss function, to seamlessly integrate the Log Gaussian Cox process with deep learning. This approach provides not only statistical regression results but also classification results at different thresholds without retraining. Experimental results with noisy training data on industrial designs demonstrate that LGC-Net achieves significantly better accuracy in DRV density prediction than prior art.
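One way to see why a count-intensity model can serve several classification thresholds without retraining: in a Cox process the count in a region is Poisson-distributed once the intensity is given, so the probability of exceeding any threshold follows directly from the predicted intensity. The sketch below shows only that last step and is not LGC-Net itself; the function name `prob_count_exceeds` and the example intensity are assumptions.

```python
import math

def prob_count_exceeds(intensity, threshold):
    """P(N > threshold) for N ~ Poisson(intensity), threshold a non-negative integer."""
    term = math.exp(-intensity)  # P(N = 0)
    cdf = term
    for k in range(1, threshold + 1):
        term *= intensity / k    # P(N = k) derived from P(N = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

# One hypothetical predicted DRV intensity for a layout tile, several thresholds.
predicted_intensity = 3.0
for threshold in (0, 2, 5):
    print(threshold, round(prob_count_exceeds(predicted_intensity, threshold), 4))
```

The same predicted intensity yields a hotspot probability for each threshold, so changing the threshold is a post-processing choice rather than a new training run.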

Citations: 3

Analyzing and Improving Resilience and Robustness of Autonomous Systems (Invited Paper)

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3561111

Zishen Wan, Karthik Swaminathan, Pin-Yu Chen, Nandhini Chandramoorthy, A. Raychowdhury

Autonomous systems have reached a tipping point, with a myriad of self-driving cars, unmanned aerial vehicles (UAVs), and robots being widely deployed and enabling revolutionary new applications. The continuous deployment of autonomous systems reveals the need for designs that facilitate increased resiliency and safety. The ability of an autonomous system to tolerate or mitigate errors arising from environmental conditions, sensor, hardware, and software faults, and adversarial attacks is essential to ensure its functional safety. Application-aware resilience metrics, holistic fault analysis frameworks, and lightweight fault mitigation techniques are being proposed for accurate and effective assessment and improvement of resilience and robustness. This paper explores the origins of fault sources across the computing stack of autonomous systems, discusses the fault impacts and fault mitigation techniques for autonomous systems of different scales, and concludes with challenges and opportunities for assessing and building next-generation resilient and robust autonomous systems.

Citations: 2

Graph Neural Networks for Idling Error Mitigation

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549444

Vedika Servanan, S. Saeed

Dynamical Decoupling (DD)-based protocols have been shown to reduce the idling errors encountered in quantum circuits. However, current research on suppressing idling qubit errors suffers from scalability issues, because a large number of tuning quantum circuits must be executed first to find the locations of DD sequences in the target quantum circuit that boost the output state fidelity. This process becomes tedious as the size of the quantum circuit increases. To address this challenge, we propose a Graph Neural Network (GNN) framework, which mitigates idling errors through efficient insertion of DD sequences into quantum circuits by modeling their impact at different idle qubit windows. Our paper aims to maximize the benefit of DD sequences using a limited number of tuning circuits. We propose to classify the idle qubit windows into critical and non-critical (benign) windows using a data-driven reliability model. Our results obtained from the IBM Lagos quantum computer show that our proposed GNN models, which determine the locations of DD sequences in the quantum circuits, significantly improve the output state fidelity by a factor of 1.4x on average and up to 2.6x compared to the adaptive DD approach, which searches for the best locations of DD sequences at run-time.

Citations: 0

Attack Directories on ARM big.LITTLE Processors

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549340

Zili Kou, Sharad Sinha, Wenjian He, W. Zhang

Eviction-based cache side-channel attacks take advantage of inclusive cache hierarchies and shared cache hardware. Processors with the template ARM big.LITTLE architecture do not guarantee such preconditions and therefore will not usually allow cross-core attacks let alone cross-cluster attacks. This work reveals a new side-channel based on the snoop filter (SF), an unexplored directory structure embedded in template ARM big.LITTLE processors. Our systematic reverse engineering unveils the undocumented structure and property of the SF, and we successfully utilize it to bootstrap cross-core and cross-cluster cache eviction. We demonstrate a comprehensive methodology to exploit the SF side-channel, including the construction of eviction sets, the covert channel, and attacks against RSA and AES. When attacking TrustZone, we conduct an interrupt-based side-channel attack to extract the key of RSA by a single profiling trace, despite the strict cache clean defense. Supported by detailed experiments, the SF side-channel not only achieves competitive performance but also overcomes the main challenge of cache side-channel attacks on ARM big.LITTLE processors.

Citations: 0

ATLAS: A Two-Level Layer-Aware Scheme for Routing with Cell Movement

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Pub Date : 2022-10-29 DOI: 10.1145/3508352.3549470

Xinshi Zang, Fangzhou Wang, Jinwei Liu, Martin D. F. Wong

Placement and routing are two crucial steps in the physical design of integrated circuits (ICs). To close the gap between placement and routing, the routing with cell movement problem has attracted great attention recently. In this problem, a certain number of cells can be moved to new positions and the nets can be rerouted to improve the total wire length. In this work, we advance the study of this problem by proposing a two-level layer-aware scheme, named ATLAS. A coarse-level cluster-based cell movement is first performed to optimize via usage and to provide a better starting point for the subsequent fine-level single-cell movement. To further encourage routing on the upper metal layers, we use a set of adjusted layer weights that increase the routing cost on lower layers. Experimental results on the ICCAD 2020 contest benchmarks show that ATLAS achieves much greater wire length reduction than the state-of-the-art routing with cell movement engine. Furthermore, on the ICCAD 2021 contest benchmarks, ATLAS outperforms the first-place team of the contest with much better solution quality while being 3× faster.
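The effect of layer weights on routing cost can be shown with a small weighted wire length calculation. This is a sketch under assumed values only: the weights, the `(layer, length)` segment format, and the function name `weighted_wire_length` are hypothetical and are not the contest's or the paper's actual cost model.

```python
def weighted_wire_length(segments, layer_weights):
    """Sum of segment lengths, each scaled by the weight of its metal layer."""
    return sum(length * layer_weights[layer] for layer, length in segments)

# Hypothetical weights: lower layers (small index) cost more per unit length.
layer_weights = {1: 1.5, 2: 1.2, 3: 1.0}

# Two routes with identical geometry; only the layer of the long segment differs.
low_route = [(1, 10), (2, 4)]
high_route = [(3, 10), (2, 4)]

print(weighted_wire_length(low_route, layer_weights))
print(weighted_wire_length(high_route, layer_weights))
```

Because the long segment is cheaper on the upper layer, a router minimizing this cost is steered toward the upper metal layers even though the unweighted length of both routes is the same.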



Citations: 0
