Improving clock-rate of hard-macro designs
C. Lavin, B. Nelson, B. Hutchings
2013 International Conference on Field-Programmable Technology (FPT), 2013-12-01
DOI: 10.1109/FPT.2013.6718361
https://doi.org/10.1109/FPT.2013.6718361
Citations: 7
Abstract
HMFlow rapidly compiles large designs in a few seconds, many times faster than standard Xilinx flows, by reusing precompiled circuit modules (hard macros) together with other techniques. However, designs compiled by HMFlow often run at significantly lower clock rates than those compiled by the Xilinx flow. To improve clock rates, the HMFlow algorithms were modified in three ways: (1) the router now takes advantage of longer routing wires in the FPGA devices, (2) the original greedy placer was replaced with an annealing-based placer, and (3) certain registers were removed from the hard macros and moved into the fabric to reduce critical-path delays. Benchmark circuits compiled with these modifications achieve clock rates that are, on average, about 75% of those achieved by the Xilinx flow. Fast run-times are also preserved: the improved algorithms increase HMFlow run-times by only about 50% across the benchmark suite, so HMFlow remains more than 30× faster than the standard Xilinx flow for the benchmarks tested in this paper.
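
To make the idea of an annealing-based placer concrete, here is a minimal Python sketch of generic simulated-annealing placement. It is not HMFlow's placer: the abstract does not give its cost function, move set, or cooling schedule. The sketch assumes every hard macro occupies a single cell on a square grid and that cost is total half-perimeter wirelength. The names `wirelength` and `anneal_place` and all parameter values are hypothetical.

```python
import math
import random


def wirelength(placement, nets):
    """Total half-perimeter wirelength (HPWL) summed over all nets."""
    total = 0
    for net in nets:
        xs = [placement[m][0] for m in net]
        ys = [placement[m][1] for m in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_place(modules, nets, grid, seed=0, t_start=5.0, t_end=0.01,
                 alpha=0.95, moves_per_temp=200):
    """Place unit-sized modules on a grid x grid array by simulated annealing.

    Assumption: each module fits in one cell; real hard macros have
    varying footprints and legality constraints that are ignored here.
    """
    rng = random.Random(seed)
    sites = [(x, y) for x in range(grid) for y in range(grid)]
    rng.shuffle(sites)
    placement = {m: sites[i] for i, m in enumerate(modules)}  # random start
    free = sites[len(modules):]                               # unoccupied cells
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost

    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            a = rng.choice(modules)
            if free and rng.random() < 0.5:
                # Proposed move: relocate module a to a free cell.
                i = rng.randrange(len(free))
                placement[a], free[i] = free[i], placement[a]
                undo = ("move", a, i)
            else:
                # Proposed move: swap the cells of modules a and b.
                b = rng.choice(modules)
                placement[a], placement[b] = placement[b], placement[a]
                undo = ("swap", a, b)

            new_cost = wirelength(placement, nets)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, and accept
            # worse placements with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            elif undo[0] == "move":
                _, a, i = undo
                placement[a], free[i] = free[i], placement[a]  # revert move
            else:
                _, a, b = undo
                placement[a], placement[b] = placement[b], placement[a]  # revert swap
        t *= alpha  # geometric cooling
    return best, best_cost
```

Unlike a greedy placer, which only ever accepts improving choices, the acceptance rule above occasionally keeps a worse placement while the temperature is high, which lets the search escape local minima at the cost of extra run-time.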