可扩展的可达性驱动的分析放置器，具有fpga的全局路由器集成(仅抽象)

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays Pub Date : 2014-02-26 DOI:10.1145/2554688.2554711

Ka-Chun Lam, W. Tang, Evangeline F. Y. Young

{"title":"可扩展的可达性驱动的分析放置器，具有fpga的全局路由器集成(仅抽象)","authors":"Ka-Chun Lam, W. Tang, Evangeline F. Y. Young","doi":"10.1145/2554688.2554711","DOIUrl":null,"url":null,"abstract":"As the sizes of modern circuits become bigger and bigger, implementing those large circuits into FPGA becomes arduous. The state-of-the-art academic FPGA place-and-route tool, VPR, has good quality but needs around a whole day to complete a placement when the input circuit contains millions of lookup tables, excluding the runtime for routing. To expedite the placement process, we propose a routability-driven placement algorithm for FPGA that adopts techniques used in ASIC global placer. Our placer follows the lower-bound-and-upper-bound iterative optimization process in ASIC placers like Ripple. In the lower-bound computation, the total HPWL, modeled using the Bound2Bound net model, is minimized using the conjugate gradient method. In the upper-bound computation, an almost-legalized result is produced by spreading cells linearly in the placement area. Those positions are then served as fixed-point anchors and fed into the next lower-bound computation. Furthermore, global routing will be performed in the upper-bound computation to estimate the routing segment usage, as a mean to consider congestion in placement. We tested our approach using 20 MCNC benchmarks and 4 large benchmarks for performance and scalability. Experimental results show that based on the island-style architecture which VPR is most optimized for, our approach can obtain a placement result 8x faster than VPR with 2% more in channel width, or 3x faster with 1% more in channel width when congestion is being considered. Our approach is even 14x faster than VPR in placing large benchmarks with over 10,000 lookup tables, with only 7% more in channel width.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A scalable routability-driven analytical placer with global router integration for FPGAs (abstract only)\",\"authors\":\"Ka-Chun Lam, W. Tang, Evangeline F. Y. Young\",\"doi\":\"10.1145/2554688.2554711\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As the sizes of modern circuits become bigger and bigger, implementing those large circuits into FPGA becomes arduous. The state-of-the-art academic FPGA place-and-route tool, VPR, has good quality but needs around a whole day to complete a placement when the input circuit contains millions of lookup tables, excluding the runtime for routing. To expedite the placement process, we propose a routability-driven placement algorithm for FPGA that adopts techniques used in ASIC global placer. Our placer follows the lower-bound-and-upper-bound iterative optimization process in ASIC placers like Ripple. In the lower-bound computation, the total HPWL, modeled using the Bound2Bound net model, is minimized using the conjugate gradient method. In the upper-bound computation, an almost-legalized result is produced by spreading cells linearly in the placement area. Those positions are then served as fixed-point anchors and fed into the next lower-bound computation. Furthermore, global routing will be performed in the upper-bound computation to estimate the routing segment usage, as a mean to consider congestion in placement. We tested our approach using 20 MCNC benchmarks and 4 large benchmarks for performance and scalability. Experimental results show that based on the island-style architecture which VPR is most optimized for, our approach can obtain a placement result 8x faster than VPR with 2% more in channel width, or 3x faster with 1% more in channel width when congestion is being considered. Our approach is even 14x faster than VPR in placing large benchmarks with over 10,000 lookup tables, with only 7% more in channel width.\",\"PeriodicalId\":390562,\"journal\":{\"name\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-02-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2554688.2554711\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2554688.2554711","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

随着现代电路的尺寸越来越大，在FPGA中实现这些大型电路变得非常困难。最先进的学术FPGA放置和路由工具VPR具有良好的质量，但当输入电路包含数百万个查找表(不包括路由运行时)时，需要大约一整天才能完成放置。为了加快放置过程，我们提出了一种可达性驱动的FPGA放置算法，该算法采用了ASIC全局放置器中使用的技术。我们的砂矿遵循Ripple等ASIC砂矿的下限和上限迭代优化过程。在下界计算中，使用Bound2Bound网络模型建模的总HPWL使用共轭梯度法最小化。在上界计算中，通过在放置区域内线性扩展单元，得到一个几乎合法化的结果。然后将这些位置作为定点锚点，并输入到下一个下界计算中。此外，全局路由将在上界计算中执行，以估计路由段的使用情况，作为考虑放置中的拥塞的平均值。我们使用20个MCNC基准和4个大型性能和可伸缩性基准测试了我们的方法。实验结果表明，基于最适合VPR的岛式架构，我们的方法可以比VPR快8倍，通道宽度增加2%，考虑拥塞时可以比VPR快3倍，通道宽度增加1%。在放置超过10,000个查找表的大型基准测试时，我们的方法甚至比VPR快14倍，通道宽度仅多7%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A scalable routability-driven analytical placer with global router integration for FPGAs (abstract only)

As the sizes of modern circuits become bigger and bigger, implementing those large circuits into FPGA becomes arduous. The state-of-the-art academic FPGA place-and-route tool, VPR, has good quality but needs around a whole day to complete a placement when the input circuit contains millions of lookup tables, excluding the runtime for routing. To expedite the placement process, we propose a routability-driven placement algorithm for FPGA that adopts techniques used in ASIC global placer. Our placer follows the lower-bound-and-upper-bound iterative optimization process in ASIC placers like Ripple. In the lower-bound computation, the total HPWL, modeled using the Bound2Bound net model, is minimized using the conjugate gradient method. In the upper-bound computation, an almost-legalized result is produced by spreading cells linearly in the placement area. Those positions are then served as fixed-point anchors and fed into the next lower-bound computation. Furthermore, global routing will be performed in the upper-bound computation to estimate the routing segment usage, as a mean to consider congestion in placement. We tested our approach using 20 MCNC benchmarks and 4 large benchmarks for performance and scalability. Experimental results show that based on the island-style architecture which VPR is most optimized for, our approach can obtain a placement result 8x faster than VPR with 2% more in channel width, or 3x faster with 1% more in channel width when congestion is being considered. Our approach is even 14x faster than VPR in placing large benchmarks with over 10,000 lookup tables, with only 7% more in channel width.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

自引率

0.00%

发文量