G. Inggs, Shane T. Fleming, David B. Thomas, W. Luk
2014 International Conference on Field-Programmable Technology (FPT), pp. 12-19, December 2014. DOI: 10.1109/FPT.2014.7082747 (https://doi.org/10.1109/FPT.2014.7082747)
Is high level synthesis ready for business? A computational finance case study
High Level Synthesis (HLS) tools for Field Programmable Gate Arrays (FPGAs) have made considerable progress, and are now sufficiently mature that a novice developer can create a functionally correct implementation with limited understanding of the target hardware. In this case study, a novice developer considers a benchmark of financial problems for implementation on an FPGA via HLS. The novice starts by extending an existing CPU or GPU implementation using tools such as Xilinx's Vivado HLS, the Altera OpenCL SDK or Maxeler's MaxCompiler. When this direct source code translation inevitably fails to meet performance expectations, the developer applies optimisations such as exploiting task or pipeline parallelism, as well as C-slowing. When a combination of these optimisations is considered across a range of devices and process technologies, an acceleration of up to 220 times is achieved using these tools, the sort of acceleration expected of custom architectures. Compared to the 31 times improvement shown by an optimised multicore CPU implementation, the 60 times improvement by a GPU and the 207 times improvement by a Xeon Phi, these results suggest that HLS is indeed ready for industrial adoption.
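As a rough illustration of the kind of source-level change that C-slowing implies, the sketch below contrasts a serial accumulation, such as summing Monte Carlo payoffs, with a version that interleaves several independent partial sums. It is not taken from the paper: the function names, the `C_SLOW` factor of 8 and the assumed adder latency are all hypothetical, and the code is plain C rather than input tuned for any particular HLS tool.

```c
#include <stddef.h>

/* Hypothetical interleaving factor. In hardware it would be chosen to match
   the latency (in cycles) of the floating-point adder; 8 is an assumption. */
#define C_SLOW 8

/* Baseline: each addition depends on the result of the previous one, so a
   pipelined adder would sit idle while waiting for acc to be updated. */
double sum_serial(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* C-slowed variant: C_SLOW independent partial sums are updated round-robin,
   so consecutive iterations never touch the same accumulator. A tool-specific
   pipelining directive on this loop is assumed but not shown. */
double sum_c_slowed(const double *x, size_t n) {
    double partial[C_SLOW] = {0.0};
    for (size_t i = 0; i < n; i++) {
        partial[i % C_SLOW] += x[i];
    }
    /* Final reduction of the partial sums, done once after the main loop. */
    double acc = 0.0;
    for (size_t k = 0; k < C_SLOW; k++) {
        acc += partial[k];
    }
    return acc;
}
```

Both functions compute the same sum up to floating-point reordering. The second simply exposes independent work that a pipelined datapath could overlap, which is the intuition behind C-slowing.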