Performance variability in analog ICs is strongly nonlinear because of large process variations in advanced technologies. Capturing such variability with accurate learning-based models requires a vast amount of data, and yield estimation across multiple PVT corners increases data dimensionality further. In this paper, we propose a graph neural network (GNN)-based performance variability modeling method. The key idea is to use GNN techniques to extract variation-related local mismatch in analog circuits; data efficiency comes from transferring knowledge among different PVT corners. Demonstrated on three circuits in a commercial 65nm CMOS process and compared with state-of-the-art modeling techniques, our method achieves higher modeling accuracy while using significantly less training data.
A Transferable GNN-based Multi-Corner Performance Variability Modeling for Analog ICs. Hongjian Zhou, Yaguang Li, Xin Xiong, Pingqiang Zhou. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 411-416, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473858
Pub Date: 2024-01-22, DOI: 10.1109/asp-dac58780.2024.10473969
ASP-DAC 2024 Cover Page. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), 2024-01-22. DOI: https://doi.org/10.1109/asp-dac58780.2024.10473969
With the increasing integration level of flow-based microfluidics, fully programmable valve arrays (FPVAs) have emerged as the next generation of microfluidic devices. Microvalves in an FPVA are typically managed by control logic, in which valves are connected to a core input via control channels and receive control signals that guide state switching. However, critical valves whose asynchronous actuation would cause chip malfunctions must be switched simultaneously in a specific bioassay. As a result, the channel lengths from the core input to these valves must be equal or similar, which poses a challenge for channel routing of the control logic. To solve this problem, we propose a deep reinforcement learning-based adaptive routing flow for the control logic of FPVAs. With the proposed routing flow, an efficient control channel network can be constructed automatically to realize accurate control signal propagation. Meanwhile, the timing skews among synchronized valves and the total length of control channels can be minimized, generating an optimized control logic with excellent timing behavior. Simulation results on multiple benchmarks demonstrate the effectiveness of the proposed routing flow.
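The two quantities the abstract says are minimized, timing skew among synchronized valves and total control-channel length, can be made concrete with a small sketch. It assumes that each valve is summarized by one path length from the core input and that skew is measured as the spread of those lengths within a synchronized group. The function names, the weighting, and the sample numbers are hypothetical and are not taken from the paper.

```python
def group_skew(path_lengths):
    """Skew proxy: longest minus shortest core-to-valve path in one group."""
    return max(path_lengths) - min(path_lengths)


def routing_cost(sync_groups, total_channel_length, skew_weight=1.0):
    """Illustrative scalar cost: total length plus weighted worst-case skew."""
    worst_skew = max(group_skew(group) for group in sync_groups)
    return total_channel_length + skew_weight * worst_skew


# Hypothetical path lengths (in grid units) for two synchronized valve groups.
groups = [[12, 12, 13], [20, 24]]
print(group_skew(groups[0]))                          # 1
print(routing_cost(groups, total_channel_length=60))  # 64.0
```

A learning-based router could use a cost of this shape as a (negated) reward, although how the paper actually defines its reward is not stated in the abstract.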
Adaptive Control-Logic Routing for Fully Programmable Valve Array Biochips Using Deep Reinforcement Learning. Huayang Cai, Genggeng Liu, Wenzhong Guo, Zipeng Li, Tsung-Yi Ho, Xing Huang. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 564-569, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473962
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473947
Akul Malhotra, Chunguang Wang, Sumeet Kumar Gupta
Compute-in-memory based binary neural networks or CiM-BNNs offer high energy/area efficiency for the design of edge deep neural network (DNN) accelerators, with only a mild accuracy reduction. However, for successful deployment, the design of CiM-BNNs must consider challenges such as memory faults and data security that plague existing DNN accelerators. In this work, we aim to mitigate both these problems simultaneously by proposing BNN-Flip, a training-free weight transformation algorithm that not only enhances the fault tolerance of CiM-BNNs but also protects them from weight theft attacks. BNN-Flip inverts the rows and columns of the BNN weight matrix in a way that reduces the impact of memory faults on the CiM-BNN’s inference accuracy, while preserving the correctness of the CiM operation. Concurrently, our technique encodes the CiM-BNN weights, securing them from weight theft. Our experiments on various CiM-BNNs show that BNN-Flip achieves an inference accuracy increase of up to 10.55% over the baseline (i.e. CiM-BNNs not employing BNN-Flip) in the presence of memory faults. Additionally, we show that the encoded weights generated by BNN-Flip furnish extremely low (near ‘random guess’) inference accuracy for the adversary attempting weight theft. The benefits of BNN-Flip come with an energy overhead of < 3%.
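The claim that rows and columns of the weight matrix can be inverted while preserving the correctness of the computation can be illustrated with a generic identity for ±1 matrices. This is a minimal sketch under stated assumptions: weights and inputs are encoded as +1/-1, "inverting" means negating, and the flips are undone by negating the matching inputs and outputs. The abstract does not say how BNN-Flip selects flips or applies compensation in hardware, so all names below are hypothetical.

```python
import random


def matvec(weights, x):
    """Plain matrix-vector product over +/-1 values."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def flip_rows_cols(weights, row_flips, col_flips):
    """Negate the selected rows and columns (an entry in both is negated twice)."""
    return [
        [w * (-1 if i in row_flips else 1) * (-1 if j in col_flips else 1)
         for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


def compensated_matvec(stored, x, row_flips, col_flips):
    """Undo the flips: negate inputs of flipped columns, outputs of flipped rows."""
    x_adj = [-xi if j in col_flips else xi for j, xi in enumerate(x)]
    y = matvec(stored, x_adj)
    return [-yi if i in row_flips else yi for i, yi in enumerate(y)]


random.seed(0)
W = [[random.choice((-1, 1)) for _ in range(6)] for _ in range(4)]
x = [random.choice((-1, 1)) for _ in range(6)]
rows, cols = {1, 3}, {0, 4}

stored = flip_rows_cols(W, rows, cols)  # what would sit in memory
assert stored != W                      # stored weights differ from the originals
assert compensated_matvec(stored, x, rows, cols) == matvec(W, x)
```

Without knowledge of `rows` and `cols`, the stored matrix alone yields a different product, which is the intuition behind using the same transformation as an encoding.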
BNN-Flip: Enhancing the Fault Tolerance and Security of Compute-in-Memory Enabled Binary Neural Network Accelerators. Akul Malhotra, Chunguang Wang, Sumeet Kumar Gupta. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 146-152, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473947
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473798
Zhikai Wang, Jingbo Zhou, Xiaosen Liu, Yan Wang
Online surrogate model-assisted evolution algorithms (SAEAs) are very efficient for analog/RF circuit optimization. To improve modeling accuracy and sizing results, we propose an efficient transfer learning-assisted global optimization (TLAGO) scheme that transfers useful knowledge between neural networks to improve modeling accuracy in SAEAs. The novelty lies mainly in a transfer learning scheme, comprising a modeling strategy and an adaptive transfer learning network for high-accuracy modeling, and in a greedy strategy for balancing exploration and exploitation. With lower optimization time, TLAGO converges faster and achieves more than 8% better performance than GASPAD.
An Efficient Transfer Learning Assisted Global Optimization Scheme for Analog/RF Circuits. Zhikai Wang, Jingbo Zhou, Xiaosen Liu, Yan Wang. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 417-422, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473798
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473953
Lucas Klemmer Daniel, Daniel Große
Taking a hardware design from concept to silicon is a long and complicated process, partly due to very long-running simulations. After modifying a Register Transfer Level (RTL) design, it is typically handed off to the simulator, which then simulates the full design for a given amount of time. If a bug is discovered, there is no way to adjust the design while still in the context of the simulation. Instead, all simulation results are thrown away, and the entire cycle must be restarted from the beginning. In this paper, we argue that it is worth breaking up this strict separation between design languages, analysis languages, verification languages, and simulators. We present virtual signals, a methodology to inject new logic into existing waveforms. Virtual signals are based on WAL, an open-source waveform analysis language, and can therefore use the capabilities of WAL for debugging, fixing, analyzing, and verifying a design. All this enables an interactive and fast-response design-debug-verification cycle. To demonstrate the benefits of our methodology, we present a case study in which we show how the technique improves debugging and design analysis.
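The idea of injecting new logic into an existing waveform can be sketched without rerunning a simulation. The example assumes a waveform is held as a dictionary that maps signal names to one sampled value per time step, and that a virtual signal is a function of the existing signals at the same step. This is plain Python for illustration only: it is not WAL syntax, and the helper name and the signals are hypothetical.

```python
def add_virtual_signal(waveform, name, expr):
    """Return a copy of the waveform with one extra signal computed per step."""
    steps = len(next(iter(waveform.values())))
    derived = [
        expr({sig: values[t] for sig, values in waveform.items()})
        for t in range(steps)
    ]
    return {**waveform, name: derived}


# Recorded trace from an earlier (hypothetical) simulation run.
trace = {"valid": [0, 1, 1, 0, 1], "ready": [0, 0, 1, 1, 1]}

# New logic evaluated on the stored waveform instead of in the simulator.
trace = add_virtual_signal(trace, "handshake", lambda s: s["valid"] & s["ready"])
print(trace["handshake"])  # [0, 0, 1, 0, 1]
```

Because the derived signal lives alongside the recorded ones, later analyses can treat it like any other signal in the trace.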
Towards a Highly Interactive Design-Debug-Verification Cycle. Lucas Klemmer Daniel, Daniel Große. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 692-697, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473953
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473960
Chedi Morchdi, Cheng-Hsiang Chiu, Yi Zhou, Tsung-Wei Huang
Computer-aided design (CAD) tools typically incorporate thousands or millions of functional tasks and dependencies to implement various synthesis and analysis algorithms. Efficiently scheduling these tasks in a computing environment that comprises manycore CPUs and GPUs is critically important because it governs the macro-scale performance. However, existing scheduling methods are typically hardcoded within an application and do not adapt to changes in the computing environment. To overcome this challenge, this paper introduces a novel reinforcement learning-based scheduling algorithm that learns to adapt the performance optimization to a given runtime (task execution environment) situation. We present a case study on VLSI timing analysis to demonstrate the effectiveness of our learning-based scheduling algorithm. For instance, our algorithm can achieve the same performance as the baseline while using only 20% of the CPU resources.
A Resource-efficient Task Scheduling System using Reinforcement Learning: Invited Paper. Chedi Morchdi, Cheng-Hsiang Chiu, Yi Zhou, Tsung-Wei Huang. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 89-95, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473960
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473813
C. Harshitha, Sundarapalli Harikrishna, Peddakotla Rohith, Sandeep Chandran, Rajshekar Kalayappan
In post-silicon validation, the first step when an erroneous behavior is uncovered by a long-running test case is to reproduce the observed behavior in a shorter execution. This makes the error amenable to a variety of debugging tools and techniques. In this work, we propose a tool called Gru that takes a long execution trace as input and generates a set of executables, one for each section of the trace. Each generated executable is guaranteed to independently and faithfully replicate the behavior observed in the corresponding section of the original, complex test case. This enables the generated executables to be run simultaneously across different silicon samples, thereby allowing further debugging activities to proceed in parallel. The generation of executables does not require the source code of the complex test case and hence supports privacy-aware debugging in scenarios involving sensitive Intellectual Properties (IPs). We demonstrate the effectiveness of this tool on a collection of 10 EEMBC benchmarks that are executed on a bare-metal LEON3 SoC.
On Decomposing Complex Test Cases for Efficient Post-silicon Validation. C. Harshitha, Sundarapalli Harikrishna, Peddakotla Rohith, Sandeep Chandran, Rajshekar Kalayappan. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 256-261, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473813
Chiplets have recently emerged as a promising solution for achieving further performance improvements by breaking complex processors into modular components that communicate through high-speed inter-chiplet serial links. However, the ever-growing on-package routing density and data rates of such serial links inevitably lead to more complex and more severe signal and power integrity issues than in a large monolithic chip. This creates strong demand for efficient analysis and validation tools to support robust design. In this paper, a signal-power integrity co-analysis framework for high-speed inter-chiplet serial link validation, named SPIRAL, is proposed. The framework first builds equivalent models for the links, with a machine learning-based transmitter model and an impulse response-based model for the channel and receiver. Then, the signal-power integrity is co-analyzed with a pulse response-based method using the equivalent models. Experimental results show that SPIRAL yields eye diagrams with 0.82-1.85% mean relative error, while achieving $18$-$44\times$ speedup compared to a commercial SPICE.
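To show what a pulse response-based eye analysis can look like in its simplest form, the sketch below applies textbook peak-distortion analysis. It is not SPIRAL's method. It assumes a linear link, ±1 (NRZ) symbols, and a pulse response sampled once per unit interval, and it ignores power-supply noise entirely. The cursor values and function names are hypothetical.

```python
from itertools import product


def sample_at_cursor(symbols, pulse):
    """Superpose pulse responses: one received sample from neighbouring symbols."""
    return sum(s * h for s, h in zip(symbols, pulse))


def worst_case_eye_height(pulse, main):
    """Peak-distortion bound: 2 * (main cursor - sum of |ISI cursors|)."""
    isi = sum(abs(h) for k, h in enumerate(pulse) if k != main)
    return 2.0 * (pulse[main] - isi)


# Hypothetical pulse response cursors; the main cursor is at index 1.
pulse = [0.1, 0.7, 0.2, -0.05]
main = 1

# Brute-force check: lowest received "1" over every pattern of interfering symbols.
lowest_one = min(
    sample_at_cursor(pattern, pulse)
    for pattern in product((-1, 1), repeat=len(pulse))
    if pattern[main] == 1
)

# By symmetry the lowest "1" and highest "0" are mirror images around zero.
assert abs(2 * lowest_one - worst_case_eye_height(pulse, main)) < 1e-12
```

The closed-form bound replaces an exhaustive bit-pattern sweep with a single sum, which is one reason pulse-response methods can be much faster than transient simulation.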
SPIRAL: Signal-Power Integrity Co-Analysis for High-Speed Inter-Chiplet Serial Links Validation. Xiao Dong, Songyu Sun, Yangfan Jiang, Jingtong Hu, Dawei Gao, Cheng Zhuo. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 625-630, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473908
Pub Date: 2024-01-22, DOI: 10.1109/ASP-DAC58780.2024.10473842
J. Lappas, Mohamed Amine Riahi, C. Weis, Norbert Wehn, Sani Nassif
With scaling unabated, device density continues to increase, but power and thermal budgets prevent the full use of all available devices. This leads to the exploration of alternative circuit styles beyond traditional CMOS, especially dynamic data-dependent styles, but the excessive pessimism inherent in conventional static timing analysis tools presents a barrier to adoption. One such circuit family is Pass-Transistor Logic (PTL), which holds significant promise but behaves differently from CMOS in that traditional CMOS-oriented EDA tools cannot produce sufficiently accurate performance estimates. In this work, we revisit timing analysis and its premises and show a significantly improved methodology of a more generalized dynamic timing engine that accurately predicts timing performance for traditional CMOS as well as PTL with an accuracy of 4.0% compared to SPICE and with a run-time comparable to traditional gate-level simulation. The run-time improvement compared with SPICE is four orders of magnitude.
Timing Analysis beyond Complementary CMOS Logic Styles. J. Lappas, Mohamed Amine Riahi, C. Weis, Norbert Wehn, Sani Nassif. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 189-194, 2024-01-22. DOI: https://doi.org/10.1109/ASP-DAC58780.2024.10473842