Performance Optimized Clock Tree Embedding for Auto-Generated FPGAs
Authors: Grant Brown, Ganesh Gore, P. Gaillardon
Venue: 2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Published: 2023-06-20
DOI: https://doi.org/10.1109/ISVLSI59464.2023.10238626
Field Programmable Gate Arrays (FPGAs) have grown in popularity across a wide range of applications due to their reconfigurability and lower non-recurring engineering costs compared with application-specific integrated circuits (ASICs). To keep pace with growing application needs and process technology improvements, commercial FPGAs have traditionally relied on full-custom chip design approaches. However, embedded FPGAs (eFPGAs) have shifted FPGA use toward more application-specific roles, creating the need for an agile design approach that accelerates the eFPGA design process. Recent agile FPGA design methods have therefore introduced automation into the design process, allowing semi-automated fine-tuning of physical and architectural parameters, which reduces the physical design iteration time for FPGAs. These novel grid-based design methods render commercially available Clock Tree Synthesis (CTS) algorithms ineffective on modern FPGA fabrics. To overcome these deficiencies, we propose a novel clock tree embedding algorithm that uses a symmetrical clock tree to minimize skew, followed by an efficient pruning method that leverages traditional Static Timing Analysis (STA) to improve clock latency. Experimental results on $2\times 2,\ 7\times 7,\ 8\times 8,\ 29\times 29$, and $32\times 32$ FPGAs show that our proposed CTS algorithm can achieve up to a 50% improvement in latency and over a $10\times$ reduction in skew when compared to an implementation using commercial CTS methodology.
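The abstract does not spell out the algorithm, so the following is only a toy illustration of its two ideas (a symmetrical tree, then pruning) and not the authors' method. It assumes an H-tree-style quad split over the next power-of-two grid, tile centres as clock sinks, and Manhattan wire length as a stand-in for delay. All names (`Node`, `build_h_tree`, `prune`, `embed_clock_tree`, `leaf_latencies`) are hypothetical. The STA-driven latency optimization is not modeled; pruning here only removes branches that serve no tile.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    x: float
    y: float
    children: list = field(default_factory=list)


def build_h_tree(cx, cy, half, levels):
    """Recursively build a symmetric quad-split tree centred at (cx, cy)."""
    node = Node(cx, cy)
    if levels == 0:
        return node
    q = half / 2  # offset from this node to each child centre
    for dx in (-q, q):
        for dy in (-q, q):
            node.children.append(build_h_tree(cx + dx, cy + dy, q, levels - 1))
    return node


def prune(node, keep):
    """Drop leaves rejected by keep(), then any branch left without sinks."""
    if not node.children:
        return node if keep(node) else None
    pruned = (prune(child, keep) for child in node.children)
    node.children = [child for child in pruned if child is not None]
    return node if node.children else None


def embed_clock_tree(width, height):
    """Symmetric tree over the next power-of-two grid, pruned to the fabric."""
    levels = (max(width, height) - 1).bit_length()  # ceil(log2(n)) for n >= 1
    n = 2 ** levels
    root = build_h_tree(n / 2, n / 2, n / 2, levels)
    # Assumption: tile (i, j) has its clock sink at (i + 0.5, j + 0.5).
    return prune(root, lambda leaf: leaf.x < width and leaf.y < height)


def leaf_latencies(node, acc=0.0):
    """Root-to-leaf Manhattan wire length for every remaining leaf."""
    if not node.children:
        return [acc]
    result = []
    for child in node.children:
        hop = abs(child.x - node.x) + abs(child.y - node.y)
        result += leaf_latencies(child, acc + hop)
    return result


if __name__ == "__main__":
    latencies = leaf_latencies(embed_clock_tree(7, 7))
    print(len(latencies), max(latencies) - min(latencies))
```

For a $7\times 7$ fabric the sketch builds an $8\times 8$ tree and prunes it to 49 sinks. Every remaining root-to-sink path has the same length, so skew is zero under this wire-length model. A real flow would replace the wire-length proxy with delays reported by an STA tool.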