High-Throughput Logic Timing Simulation on GPGPUs

ACM Trans. Design Autom. Electr. Syst. Pub Date: 2015-06-24 DOI: 10.1145/2714564

S. Holst, M. Imhof, H. Wunderlich

Publication type: Journal Article. Pages: 37:1-37:22.

Citations: 31

Abstract

Many EDA tasks, such as test set characterization or the precise estimation of power consumption, power droop, and temperature development, require a very large number of time-aware gate-level logic simulations. Until now, such characterizations have been feasible only for rather small designs or with reduced precision, due to the high computational demands.

The new simulation system presented here accelerates such tasks by more than two orders of magnitude and provides, for the first time, fast and comprehensive timing simulations for industrial-sized designs. Hazards, pulse filtering, and pin-to-pin delays are supported for the first time in a GPGPU-accelerated simulator, and the system can easily be extended to even more realistic delay models and further applications.

A sophisticated mapping with efficient memory utilization and access patterns, as well as minimal synchronization and control-flow divergence, is able to use the full potential of GPGPU architectures. To provide such a mapping, we combine for the first time the versatility of event-based timing simulation with the multi-dimensional parallelism used in GPU-based gate-level simulators. The result is a throughput-optimized timing simulation algorithm that runs many simulation instances in parallel and at the same time fully exploits gate parallelism within the circuit.
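The abstract names three timing effects (hazards, pulse filtering, pin-to-pin delay) and a second parallel dimension across simulation instances. The following Python sketch illustrates these ideas for a single 2-input NAND gate. It is not the authors' implementation: the waveform representation (an initial value plus a sorted list of toggle times), the function names `value_at`, `eval_nand2`, and `simulate_batch`, and the simple inertial filter are all assumptions chosen for illustration. On a GPGPU, each entry of the batch would typically be handled by its own thread rather than by a loop.

```python
from typing import List, Tuple

Waveform = Tuple[int, List[float]]  # (initial value, sorted toggle times)


def value_at(initial: int, toggles: List[float], t: float) -> int:
    """Logic value at time t: the initial value flipped once per toggle at or before t."""
    flips = sum(1 for x in toggles if x <= t)
    return initial ^ (flips & 1)


def eval_nand2(a: Waveform, b: Waveform,
               delay_a: float, delay_b: float,
               min_pulse: float) -> Waveform:
    """Evaluate a 2-input NAND gate on two input waveforms (hypothetical model).

    delay_a / delay_b are pin-to-pin delays from each input to the output.
    Output pulses shorter than min_pulse are removed (pulse filtering).
    """
    init_a, tog_a = a
    init_b, tog_b = b
    # Pin-to-pin delay: shift each input's toggles by its own delay.
    da = [t + delay_a for t in tog_a]
    db = [t + delay_b for t in tog_b]

    out_init = 1 - (init_a & init_b)
    out: List[float] = []
    cur = out_init
    for t in sorted(set(da + db)):
        new = 1 - (value_at(init_a, da, t) & value_at(init_b, db, t))
        if new != cur:
            if out and t - out[-1] < min_pulse:
                out.pop()      # pulse too short: cancel the previous toggle
            else:
                out.append(t)  # keep the transition (may be a hazard)
            cur = new
    return out_init, out


def simulate_batch(stimuli: List[Tuple[Waveform, Waveform]],
                   delay_a: float, delay_b: float,
                   min_pulse: float) -> List[Waveform]:
    """Run independent simulation instances; sequential here, one thread each on a GPU."""
    return [eval_nand2(a, b, delay_a, delay_b, min_pulse) for a, b in stimuli]


if __name__ == "__main__":
    # Input a rises and input b falls at time 1.0; unequal pin delays create a hazard.
    stim = ((0, [1.0]), (1, [1.0]))
    print(eval_nand2(*stim, delay_a=1.0, delay_b=2.0, min_pulse=0.5))
    print(eval_nand2(*stim, delay_a=1.0, delay_b=2.0, min_pulse=1.5))
```

With the unequal pin delays above, the two input transitions reach the output one time unit apart and produce a short low pulse (a hazard); raising `min_pulse` above that width filters it out. A zero-delay logic simulator would report no output activity at all for this stimulus, which is why time-aware simulation matters for power and test characterization.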


Source journal

ACM Trans. Design Autom. Electr. Syst.

Self-citation rate

0.00%

Publication count