
Latest publications: 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC)

A Transferable GNN-based Multi-Corner Performance Variability Modeling for Analog ICs
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473858
Hongjian Zhou, Yaguang Li, Xin Xiong, Pingqiang Zhou
Performance variability appears strongly nonlinear in analog ICs due to large process variations in advanced technologies. Capturing such variability with accurate learning-based models requires a vast amount of data, and yield estimation across multiple PVT corners exacerbates the data dimensionality further. In this paper, we propose a graph neural network (GNN)-based performance variability modeling method. The key idea is to leverage GNN techniques to extract variation-related local mismatch in analog circuits; data efficiency benefits from the ability to transfer knowledge among different PVT corners. Demonstrated on three circuits in a commercial 65nm CMOS process and compared with state-of-the-art modeling techniques, our method achieves higher modeling accuracy while using significantly less training data.
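To illustrate the flavor of the approach, the sketch below runs one round of mean-neighbor message passing over a tiny device graph and applies a linear readout as a stand-in performance model. The graph, features, and readout weights are invented for illustration; the paper's actual GNN architecture and cross-corner transfer mechanism are not modeled here.

```python
import numpy as np

def message_pass(adj, feats):
    """One round of mean aggregation over neighbors plus a self loop."""
    deg = adj.sum(axis=1, keepdims=True) + 1.0
    return (feats + adj @ feats) / deg

def predict(adj, feats, w):
    """Graph-level readout: average node embeddings, apply linear weights."""
    h = message_pass(adj, feats)
    return float(h.mean(axis=0) @ w)

adj = np.array([[0, 1], [1, 0]], dtype=float)  # a matched device pair
feats = np.array([[1.0], [3.0]])               # e.g. per-device Vth shifts
w = np.array([1.0])                            # hypothetical readout weights
print(predict(adj, feats, w))                  # neighbors average to 2.0
```

Message passing lets each device's embedding absorb the state of the devices it is matched with, which is why local mismatch is a natural fit for a GNN.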
Pages: 411-416.
Citations: 0
ASP-DAC 2024 Cover Page
Pub Date : 2024-01-22 DOI: 10.1109/asp-dac58780.2024.10473969
Citations: 0
Adaptive Control-Logic Routing for Fully Programmable Valve Array Biochips Using Deep Reinforcement Learning
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473962
Huayang Cai, Genggeng Liu, Wenzhong Guo, Zipeng Li, Tsung-Yi Ho, Xing Huang
With the increasing integration level of flow-based microfluidics, fully programmable valve arrays (FPVAs) have emerged as the next generation of microfluidic devices. Microvalves in an FPVA are typically managed by a control logic, where valves are connected to a core input via control channels to receive the control signals that guide state switching. Critical valves whose asynchronous actuation would cause chip malfunctions, however, need to be switched simultaneously in a specific bioassay. As a result, the channel lengths from the core input to these valves are required to be equal or similar, which poses a challenge to the channel routing of the control logic. To solve this problem, we propose a deep reinforcement learning-based adaptive routing flow for the control logic of FPVAs. With the proposed routing flow, an efficient control channel network can be constructed automatically to realize accurate control signal propagation. Meanwhile, the timing skews among synchronized valves and the total length of control channels are minimized, yielding an optimized control logic with excellent timing behavior. Simulation results on multiple benchmarks demonstrate the effectiveness of the proposed routing flow.
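The routing objective described above can be sketched as a cost combining the length spread within each group of synchronized valves (the skew) with the total channel length. The valve names, lengths, and weighting factor `lam` below are hypothetical; the paper's reinforcement-learning router itself is not modeled.

```python
# Hedged sketch of the objective a control-logic router would minimize:
# skew within each synchronized-valve group plus weighted total length.
def routing_cost(channel_lengths, sync_groups, lam=0.1):
    skew = sum(max(channel_lengths[v] for v in g) -
               min(channel_lengths[v] for v in g) for g in sync_groups)
    total = sum(channel_lengths.values())
    return skew + lam * total

# Toy instance: v1 and v2 must switch together, v3 is unconstrained.
lengths = {"v1": 10.0, "v2": 12.0, "v3": 7.0}
print(round(routing_cost(lengths, [["v1", "v2"]]), 2))  # skew 2.0 + 0.1 * 29.0 = 4.9
```

A router that only minimized total wirelength would happily make v1 and v2 very different lengths; the skew term is what encodes the simultaneous-switching requirement.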
Pages: 564-569.
Citations: 0
BNN-Flip: Enhancing the Fault Tolerance and Security of Compute-in-Memory Enabled Binary Neural Network Accelerators
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473947
Akul Malhotra, Chunguang Wang, Sumeet Kumar Gupta
Compute-in-memory based binary neural networks, or CiM-BNNs, offer high energy/area efficiency for the design of edge deep neural network (DNN) accelerators, with only a mild accuracy reduction. However, for successful deployment, the design of CiM-BNNs must address challenges such as memory faults and data security that plague existing DNN accelerators. In this work, we aim to mitigate both problems simultaneously by proposing BNN-Flip, a training-free weight transformation algorithm that not only enhances the fault tolerance of CiM-BNNs but also protects them from weight theft attacks. BNN-Flip inverts the rows and columns of the BNN weight matrix in a way that reduces the impact of memory faults on the CiM-BNN's inference accuracy while preserving the correctness of the CiM operation. Concurrently, our technique encodes the CiM-BNN weights, securing them against weight theft. Our experiments on various CiM-BNNs show that BNN-Flip achieves an inference accuracy increase of up to 10.55% over the baseline (i.e., CiM-BNNs not employing BNN-Flip) in the presence of memory faults. Additionally, we show that the encoded weights generated by BNN-Flip yield extremely low (near 'random guess') inference accuracy for an adversary attempting weight theft. These benefits come with an energy overhead of under 3%.
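The row-inversion idea can be sketched on a tiny +/-1 weight matrix: storing selected rows with all signs flipped, plus a per-row key bit, keeps the layer output recoverable, while the array contents alone look scrambled to an adversary. This is only an illustration of why sign flips preserve correctness; the paper's fault-aware selection of which rows and columns to flip is not modeled.

```python
import numpy as np

def store(W, row_mask):
    """Write an encoded copy of W to the (simulated) CiM array:
    every row selected by row_mask has its signs inverted."""
    Ws = W.copy()
    Ws[row_mask] *= -1
    return Ws

def infer(Ws, row_mask, x):
    """Compute y = W @ x from the encoded array: flipping a row of W
    negates that row's dot product, so negate those outputs back."""
    y = Ws @ x
    y[row_mask] *= -1
    return y

W = np.array([[1, -1], [-1, -1]])       # +/-1 BNN weights (toy)
x = np.array([1, 1])                    # binarized activations (toy)
mask = np.array([True, False])          # secret per-row flip key
print(infer(store(W, mask), mask, x).tolist())  # matches (W @ x) -> [0, -2]
```

Without `mask`, an attacker reading the array recovers a matrix whose flipped rows are wrong, which is the intuition behind the near-random accuracy reported for weight theft.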
Pages: 146-152.
Citations: 0
An Efficient Transfer Learning Assisted Global Optimization Scheme for Analog/RF Circuits
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473798
Zhikai Wang, Jingbo Zhou, Xiaosen Liu, Yan Wang
Online surrogate model-assisted evolutionary algorithms (SAEAs) are very efficient for analog/RF circuit optimization. To improve modeling accuracy and sizing results, we propose an efficient transfer learning-assisted global optimization (TLAGO) scheme that transfers useful knowledge between neural networks to improve modeling accuracy in SAEAs. The novelty lies mainly in a novel transfer learning scheme, comprising a modeling strategy and a novel adaptive transfer learning network for high-accuracy modeling, together with a greedy strategy for balancing exploration and exploitation. With lower optimization time, TLAGO converges faster and achieves more than 8% better performance than GASPAD.
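For readers unfamiliar with SAEAs, the sketch below shows the generic loop such methods build on: a cheap surrogate ranks many candidate sizings, and only the most promising one is sent to the expensive simulator each generation. Everything here (the toy objective, mutation scale, population handling) is invented for illustration, and the surrogate is simply the true objective; TLAGO's transfer-learning network and greedy strategy are not modeled.

```python
import random

def saea_step(population, surrogate, simulate, n_offspring=8):
    """One surrogate-assisted generation: mutate the best-known point,
    let the surrogate pick one offspring, and spend a single costly
    'simulation' on it. Returns the best simulated cost so far."""
    parent = min(population, key=surrogate)
    offspring = [[g + random.gauss(0, 0.1) for g in parent]
                 for _ in range(n_offspring)]
    best = min(offspring, key=surrogate)      # cheap model prescreens
    if simulate(best) < simulate(parent):     # one expensive evaluation
        population.append(best)
    return min(simulate(p) for p in population)

random.seed(1)
surrogate = simulate = lambda x: sum(g * g for g in x)  # toy objective
pop = [[1.0, 1.0]]
costs = [saea_step(pop, surrogate, simulate) for _ in range(20)]
print(costs[-1] < costs[0])  # population drifts toward the optimum
```

The efficiency argument is that each generation evaluates eight candidates through the surrogate but pays for only one simulator call, which is why surrogate accuracy (the part TLAGO improves) dominates overall optimization quality.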
Pages: 417-422.
Citations: 0
Towards a Highly Interactive Design-Debug-Verification Cycle
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473953
Lucas Klemmer, Daniel Große
Taking a hardware design from concept to silicon is a long and complicated process, partly due to very long-running simulations. After modifying a Register Transfer Level (RTL) design, it is typically handed off to the simulator, which then simulates the full design for a given amount of time. If a bug is discovered, there is no way to adjust the design while still in the context of the simulation. Instead, all simulation results are thrown away, and the entire cycle must be restarted from the beginning. In this paper, we argue that it is worth breaking up this strict separation between design languages, analysis languages, verification languages, and simulators. We present virtual signals, a methodology to inject new logic into existing waveforms. Virtual signals are based on WAL, an open-source waveform analysis language, and can therefore use the capabilities of WAL for debugging, fixing, analyzing, and verifying a design. All this enables an interactive, fast-response design-debug-verification cycle. To demonstrate the benefits of our methodology, we present a case study showing how the technique improves debugging and design analysis.
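The core idea, deriving new logic from already-recorded waveform values instead of re-simulating, can be sketched in a few lines. This is not WAL syntax; the `trace` dictionary, signal names, and the `virtual_signal` helper are hypothetical stand-ins for a waveform database.

```python
# Hedged sketch: compute a new "virtual" signal per timestep from
# existing recorded signals, without rerunning the simulator.
def virtual_signal(trace, name, fn, inputs):
    trace[name] = [fn(*vals) for vals in zip(*(trace[s] for s in inputs))]
    return trace

# Toy recorded waveform: two handshake wires sampled over four cycles.
trace = {"valid": [0, 1, 1, 1], "ready": [1, 0, 1, 1]}
virtual_signal(trace, "handshake", lambda v, r: v & r, ["valid", "ready"])
print(trace["handshake"])  # [0, 0, 1, 1]: cycles where the transfer fired
```

Because the virtual signal is computed over the existing trace, a designer can test a fix or a new assertion against hours of recorded simulation in seconds, which is the interactivity the paper argues for.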
Pages: 692-697.
Citations: 0
A Resource-efficient Task Scheduling System using Reinforcement Learning: Invited Paper
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473960
Chedi Morchdi, Cheng-Hsiang Chiu, Yi Zhou, Tsung-Wei Huang
Computer-aided design (CAD) tools typically incorporate thousands or millions of functional tasks and dependencies to implement various synthesis and analysis algorithms. Efficiently scheduling these tasks in a computing environment comprising manycore CPUs and GPUs is critically important because it governs macro-scale performance. However, existing scheduling methods are typically hardcoded within an application and are not adaptive to changes in the computing environment. To overcome this challenge, this paper introduces a novel reinforcement learning-based scheduling algorithm that learns to adapt its performance optimization to a given runtime (task execution environment) situation. We present a case study on VLSI timing analysis to demonstrate the effectiveness of our learning-based scheduling algorithm. For instance, our algorithm can achieve the same performance as the baseline while using only 20% of the CPU resources.
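A minimal sketch of learning a resource allocation appears below: an epsilon-greedy bandit picks how many cores to grant the next batch of tasks, with a reward that penalizes both runtime and core usage. The core options, reward weights, and fake runtime measurements are invented; the paper's actual state/action formulation for task graphs is far richer.

```python
import random

class CoreScheduler:
    """Epsilon-greedy bandit over candidate core counts (toy model)."""
    def __init__(self, core_options, eps=0.1):
        self.q = {c: 0.0 for c in core_options}  # running reward estimates
        self.n = {c: 0 for c in core_options}
        self.eps = eps

    def choose(self):
        if random.random() < self.eps:
            return random.choice(list(self.q))   # explore
        return max(self.q, key=self.q.get)       # exploit

    def update(self, cores, runtime):
        reward = -runtime - 0.01 * cores         # prefer fast AND lean runs
        self.n[cores] += 1
        self.q[cores] += (reward - self.q[cores]) / self.n[cores]

random.seed(0)
sched = CoreScheduler([2, 8])
for _ in range(50):
    c = sched.choose()
    # Fake measurement: here 8 cores cut runtime enough to justify the cost.
    sched.update(c, runtime=1.0 if c == 8 else 1.5)
print(max(sched.q, key=sched.q.get))  # the learned preference
```

The same loop with different measured runtimes would converge to the leaner option, which is the sense in which such a scheduler adapts to its environment rather than being hardcoded.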
Pages: 89-95.
Citations: 0
On Decomposing Complex Test Cases for Efficient Post-silicon Validation
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473813
C. Harshitha, Sundarapalli Harikrishna, Peddakotla Rohith, Sandeep Chandran, Rajshekar Kalayappan
In post-silicon validation, the first step when a long-running test case uncovers erroneous behavior is to reproduce the observed behavior in a shorter execution, which makes the error amenable to a variety of debugging tools and techniques. In this work, we propose a tool called Gru that takes a long execution trace as input and generates a set of executables, one for each section of the trace. Each generated executable is guaranteed to faithfully and independently replicate the behavior observed in the corresponding section of the original, complex test case. This enables the generated executables to be run simultaneously across different silicon samples, allowing further debugging activities to proceed in parallel. Generating the executables does not require the source code of the complex test case, and hence the tool supports privacy-aware debugging in scenarios involving sensitive Intellectual Properties (IPs). We demonstrate the effectiveness of this tool on a collection of 10 EEMBC benchmarks executed on a bare-metal LEON3 SoC.
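The decomposition idea can be sketched as follows: cut the trace into fixed-size sections and pair each section's operations with the machine state recorded at its boundary, so each piece can replay independently. The trace format and section contents here are hypothetical toys; Gru itself works on real execution traces and emits binaries, not Python dictionaries.

```python
# Hedged sketch: split a long trace into independently replayable sections,
# each carrying the entry state captured at its boundary.
def decompose(trace, section_len):
    sections = []
    for i in range(0, len(trace), section_len):
        part = trace[i:i + section_len]
        entry_state = part[0]["state"]            # snapshot at the boundary
        sections.append({"entry": entry_state,
                         "ops": [e["op"] for e in part]})
    return sections

# Toy trace: six events, each with a recorded state and an operation.
trace = [{"state": s, "op": f"op{s}"} for s in range(6)]
parts = decompose(trace, 3)
print(len(parts), parts[1]["entry"])  # 2 sections; second starts at state 3
```

Because every section carries its own entry state, the sections need not run in order, which is what allows them to be farmed out to different silicon samples in parallel.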
Pages: 256-261.
Citations: 0
SPIRAL: Signal-Power Integrity Co-Analysis for High-Speed Inter-Chiplet Serial Links Validation
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473908
Xiao Dong, Songyu Sun, Yangfan Jiang, Jingtong Hu, Dawei Gao, Cheng Zhuo
Chiplets have recently emerged as a promising solution for further performance improvements, breaking complex processors down into modular components that communicate through high-speed inter-chiplet serial links. However, the ever-growing on-package routing density and data rates of such serial links inevitably lead to more complex and more severe signal and power integrity issues than in a large monolithic chip, demanding efficient analysis and validation tools to support robust design. In this paper, we propose SPIRAL, a signal-power integrity co-analysis framework for high-speed inter-chiplet serial link validation. The framework first builds equivalent models for the links, using a machine learning-based transmitter model and impulse response-based models for the channel and receiver. Signal and power integrity are then co-analyzed with a pulse response-based method using these equivalent models. Experimental results show that SPIRAL yields eye diagrams with 0.82-1.85% mean relative error while achieving an 18-44x speedup over a commercial SPICE.
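To give a sense of what a pulse-response-based link analysis computes, the sketch below applies classic peak-distortion analysis: the worst-case eye opening equals the main cursor of the channel pulse response minus the summed magnitudes of all other cursors sampled at the symbol rate. The pulse values are invented, and SPIRAL's co-analysis of power-supply effects goes well beyond this single-number estimate.

```python
# Hedged sketch of a pulse-response eye estimate (peak-distortion analysis).
def worst_case_eye(pulse, main_idx):
    """Worst-case eye height: main cursor minus total |ISI| of all
    pre- and post-cursors in the sampled pulse response."""
    isi = sum(abs(v) for i, v in enumerate(pulse) if i != main_idx)
    return pulse[main_idx] - isi

pulse = [50, 800, 100, -50]        # per-UI pulse samples in mV (toy values)
print(worst_case_eye(pulse, 1))    # 800 - (50 + 100 + 50) = 600 mV
```

The attraction of this style of analysis is that one simulated pulse response replaces millions of simulated bits, which is where the orders-of-magnitude speedup over transient SPICE comes from.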
Pages: 625-630.
Citations: 0
Timing Analysis beyond Complementary CMOS Logic Styles
Pub Date : 2024-01-22 DOI: 10.1109/ASP-DAC58780.2024.10473842
J. Lappas, Mohamed Amine Riahi, C. Weis, Norbert Wehn, Sani Nassif
With scaling unabated, device density continues to increase, but power and thermal budgets prevent the full use of all available devices. This motivates the exploration of alternative circuit styles beyond traditional CMOS, especially dynamic data-dependent styles, but the excessive pessimism inherent in conventional static timing analysis tools presents a barrier to adoption. One such circuit family is Pass-Transistor Logic (PTL), which holds significant promise but behaves so differently from CMOS that traditional CMOS-oriented EDA tools cannot produce sufficiently accurate performance estimates. In this work, we revisit timing analysis and its premises and present a significantly improved methodology, a more generalized dynamic timing engine that accurately predicts timing performance for traditional CMOS as well as PTL, matching SPICE to within 4.0% with a run-time comparable to traditional gate-level simulation, four orders of magnitude faster than SPICE.
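For context, the static arrival-time propagation that conventional timing tools perform can be sketched in a few lines: each gate's arrival time is its delay plus the latest fanin arrival. The netlist and delays below are invented. The paper's point is precisely that such fixed per-gate delays are too pessimistic for data-dependent styles like PTL, which this sketch does not capture.

```python
# Hedged sketch of conventional static arrival-time propagation.
def arrival_times(gates, primary_inputs):
    """gates: name -> (delay, [fanin names]), listed in topological order.
    Each gate's arrival time is its delay plus the worst fanin arrival."""
    at = {pi: 0.0 for pi in primary_inputs}
    for name, (delay, fanins) in gates.items():
        at[name] = delay + max(at[f] for f in fanins)
    return at

# Toy netlist: g1 drives g2, both also fed by primary inputs.
gates = {"g1": (1.0, ["a", "b"]),
         "g2": (2.0, ["g1", "b"])}
print(arrival_times(gates, ["a", "b"])["g2"])  # 1.0 + 2.0 = 3.0
```

A dynamic engine instead evaluates delays for the actual data values in each cycle, trading the single worst-case number above for per-pattern timing that stays accurate for PTL.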
Pages: 189-194.
Citations: 0