{"title":"Profiling-Based Control-Flow Reduction in High-Level Synthesis","authors":"Austin Liolli, Omar Ragheb, J. Anderson","doi":"10.1109/ICFPT52863.2021.9609816","DOIUrl":null,"url":null,"abstract":"Control flow in a program can be represented in a directed graph, called the control flow graph (CFG). Nodes in the graph represent straight-line segments of code, basic blocks, and directed edges between nodes correspond to transfers of control. We present a methodology to selectively reduce control flow by collapsing basic blocks into their parent blocks, revealing increased instruction-level parallelism to a high-level synthesis (HLS) scheduler, thereby raising circuit performance.We evaluate our approach within an HLS tool that allows a C-language software program to be automatically synthesized into a hardware circuit, using the CHStone benchmark suite [1], targeting an Intel Cyclone V FPGA. For individual benchmark circuits we observe cycle count reductions up to 20.7% and wall-clock time reductions up to 22.6%, and 6% on average.","PeriodicalId":376220,"journal":{"name":"2021 International Conference on Field-Programmable Technology (ICFPT)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Field-Programmable Technology (ICFPT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICFPT52863.2021.9609816","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Control flow in a program can be represented in a directed graph, called the control flow graph (CFG). Nodes in the graph represent straight-line segments of code, basic blocks, and directed edges between nodes correspond to transfers of control. We present a methodology to selectively reduce control flow by collapsing basic blocks into their parent blocks, revealing increased instruction-level parallelism to a high-level synthesis (HLS) scheduler, thereby raising circuit performance.We evaluate our approach within an HLS tool that allows a C-language software program to be automatically synthesized into a hardware circuit, using the CHStone benchmark suite [1], targeting an Intel Cyclone V FPGA. For individual benchmark circuits we observe cycle count reductions up to 20.7% and wall-clock time reductions up to 22.6%, and 6% on average.
程序中的控制流可以用有向图表示,称为控制流图(CFG)。图中的节点表示代码的直线段,基本块,节点之间的有向边对应于控制的转移。我们提出了一种方法,通过将基本块折叠到它们的父块中来选择性地减少控制流,从而向高级合成(HLS)调度程序揭示增加的指令级并行性,从而提高电路性能。我们在HLS工具中评估我们的方法,该工具允许c语言软件程序自动合成到硬件电路中,使用CHStone基准套件[1],针对英特尔Cyclone V FPGA。对于单个基准电路,我们观察到周期计数减少了20.7%,时钟时间减少了22.6%,平均减少了6%。