Performance Optimized Clock Tree Embedding for Auto-Generated FPGAs

2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). Pub Date: 2023-06-20. DOI: 10.1109/ISVLSI59464.2023.10238626

Grant Brown, Ganesh Gore, P. Gaillardon


Abstract

Field Programmable Gate Arrays (FPGAs) have grown in popularity across a myriad of applications due to their reconfigurability and lower non-recurring engineering costs compared to application-specific integrated circuits (ASICs). To keep pace with growing application needs and process technology improvements, commercial FPGAs have traditionally relied on full-custom chip design approaches. However, embedded FPGAs (eFPGAs) have shifted FPGA use toward more application-specific roles, creating the need for an agile design approach that accelerates the eFPGA design process. Hence, recent agile FPGA design methods have introduced automation into the design process, allowing semi-automated fine-tuning of physical and architectural parameters, which reduces the physical design iteration time for FPGAs. These novel grid-based design methods render commercially available Clock Tree Synthesis (CTS) algorithms ineffective on modern FPGA fabrics. To overcome these deficiencies, we propose a novel clock tree embedding algorithm that uses a symmetrical clock tree to minimize skew, followed by an efficient pruning method that leverages traditional Static Timing Analysis (STA) to improve clock latency. Experimental results on $2\times 2,\ 7\times 7,\ 8\times 8,\ 29\times 29$, and $32\times 32$ FPGAs show that our proposed CTS algorithm can achieve up to a 50% improvement in latency and over a $10\times$ reduction in skew when compared to an implementation using commercial CTS methodology.
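To illustrate why a symmetrical clock tree helps with skew, here is a minimal sketch. It is not the algorithm from the paper, which the abstract does not specify in detail. It assumes a textbook H-tree over an n×n tile grid where n is a power of two, and it assumes that delay is proportional to Manhattan wirelength. The function names `h_tree_sink_lengths` and `skew_and_latency` are hypothetical. Skew is taken as the difference between the latest and earliest sink arrival, and latency as the latest arrival. How the paper handles non-power-of-two grids such as 7×7 or 29×29, and how its STA-driven pruning works, is not covered here.

```python
def h_tree_sink_lengths(n):
    """Map each tile (col, row) of an n x n grid to its root-to-sink wirelength.

    Assumption: a plain H-tree rooted at the grid centre, n a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    sinks = {}

    def descend(x0, y0, size, length):
        # (x0, y0) is the lower-left tile of a size x size region.
        if size == 1:
            sinks[(x0, y0)] = length
            return
        half = size // 2
        # Each quadrant centre is half/2 away in x and half/2 away in y
        # from this region's centre, so the Manhattan wire length is `half`.
        for dx in (0, half):
            for dy in (0, half):
                descend(x0 + dx, y0 + dy, half, length + half)

    descend(0, 0, n, 0)
    return sinks


def skew_and_latency(arrivals):
    """Return (skew, latency) for a collection of sink arrival values."""
    values = list(arrivals)
    latest, earliest = max(values), min(values)
    return latest - earliest, latest


if __name__ == "__main__":
    lengths = h_tree_sink_lengths(8)
    skew, latency = skew_and_latency(lengths.values())
    print(f"sinks={len(lengths)} skew={skew} latency={latency}")
```

Under these assumptions every sink sits at the same distance from the root, so the modelled skew is zero by construction. Real fabrics add buffer, load, and routing effects that a wirelength-only model ignores.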


