Source-to-source translation: Impact on the performance of high level synthesis
Authors: Meena Belwal, Sudarshan TSB
Published in: 2017 International Conference on Computing, Communication and Automation (ICCCA), pp. 951-956, 2017-05-01
DOI: 10.1109/CCAA.2017.8229944 (https://doi.org/10.1109/CCAA.2017.8229944)
Citations: 1
Abstract
Recent developments in industry, such as Microsoft's use of FPGAs (Field Programmable Gate Arrays) to accelerate its Bing search engine and Intel's initiative to integrate its CPU with an Altera FPGA on the same chip, indicate both the potential of FPGAs and the growing demand for them in high performance computing. FPGAs provide accelerated computation thanks to their flexible architecture. However, this flexibility creates challenges for the system designer, because an efficient design in terms of latency, power and energy demands hardware programming expertise. Hardware coding is both time consuming and error prone. High Level Synthesis (HLS) addresses these challenges by enabling programmers to code in high-level languages (HLLs) such as C, C++, SystemC and CUDA, and by translating this code into a hardware description language such as Verilog or VHDL. Even though HLS tools provide several optimizations, their performance is limited by implementation constraints. Some software constructs widely used in high-level languages, such as dynamic memory allocation, pointer-based data structures and recursion, are very hard to implement well in hardware and therefore restrict the performance of HLS. Source-to-source translation is a mechanism for optimizing HLL code so that the compiler can optimize it more effectively. This article investigates whether the source-to-source translation widely used for HLLs can also benefit high level synthesis. For this study, the Bones source-to-source compiler was selected to translate C code into optimized C (Optimized-C) and OpenMP code. These three types of code (C, Optimized-C and OpenMP) were synthesized in LegUp HLS for three benchmarks; performance statistics were measured for all nine cases, and the analysis was conducted in terms of speedup, area reduction, power and energy consumption.
The OpenMP code performed better than the original C code in terms of execution time (speedup range 1.86–3.49), area (gain range 1–6.55) and energy (gain range 1.86–3.55). However, the Optimized-C code did not always outperform the original C code in terms of execution time (speedup range 0.27–3.08), area (gain range 0.83–5.7) and energy (gain range 0.27–3.13). The observed power statistics were almost the same for all three input versions of the code.
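
To make the C-to-OpenMP translation concrete, the following is a minimal hand-written sketch of what such a rewrite can look like for a simple data-parallel loop. It is not output from Bones and is not one of the paper's benchmarks: the function names `scale_add_seq` and `scale_add_omp`, the thread count of 4, and the helper `gain` are all hypothetical. The `gain` helper assumes the conventional ratio definition (baseline metric divided by variant metric, so a value below 1 means the variant is worse), which the abstract does not spell out.

```c
#include <stddef.h>

/* Baseline: plain sequential C loop computing y[i] = a * x[i] + y[i]. */
void scale_add_seq(int n, int a, const int *x, int *y) {
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical OpenMP variant of the same loop. The iterations are
 * independent, so they can be divided among threads. The thread count
 * (4) is an illustrative assumption, not a value from the paper. */
void scale_add_omp(int n, int a, const int *x, int *y) {
    #pragma omp parallel for num_threads(4)
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Assumed metric definition: baseline / variant.
 * For execution time this is the speedup; the same ratio is assumed
 * here for the area and energy gains. */
double gain(double baseline, double variant) {
    return baseline / variant;
}
```

Both functions produce the same result for the same inputs; when compiled without OpenMP support the pragma is ignored and the second function simply runs sequentially. Because energy is power multiplied by time, roughly constant power would make the energy gain track the speedup, which is consistent with the similar ranges reported for the two metrics.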