Optimizing VLIW Instruction Scheduling via a Two-Dimensional Constrained Dynamic Programming

ACM Transactions on Design Automation of Electronic Systems | Pub Date: 2024-01-25 | DOI: 10.1145/3643135 | IF 2.2 | Zone 4 (Computer Science) | JCR Q3 (Computer Science, Hardware & Architecture)
Can Deng, Zhaoyun Chen, Yang Shi, Yimin Ma, Mei Wen, Lei Luo

Abstract

Typical embedded processors, such as Digital Signal Processors (DSPs), usually adopt a Very Long Instruction Word (VLIW) architecture to improve computing efficiency. The performance of VLIW processors relies heavily on Instruction-Level Parallelism (ILP), so an efficient instruction scheduling algorithm that exploits more ILP is crucial. Heuristic algorithms are widely used in modern compilers because they are simple to implement and computationally cheap, but they are limited in the accuracy of their solutions and are prone to local optima. Exact algorithms, on the other hand, can usually find the optimal solution, but their high time overhead makes them less suitable for large-scale problems. This paper proposes a two-dimensional constrained dynamic programming (TDCDP) approach and a quantitative model for instruction scheduling. The TDCDP approach achieves near-optimal solutions within an acceptable time overhead. Furthermore, we integrate the TDCDP approach into a mainstream compiler architecture, covering both Pre-RA and Post-RA (register allocation) scheduling. We quantitatively evaluate TDCDP against four heuristic algorithms on a typical VLIW processor. Our approach improves the efficiency of the final solutions by up to 58.34% compared to the heuristic algorithms. Additionally, Post-RA scheduling yields an average speedup of 14.04% over applying Pre-RA scheduling alone.
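
To make the scheduling problem concrete, the following is a minimal sketch of the kind of greedy heuristic the abstract contrasts with exact methods: critical-path list scheduling on a dependency graph. It is not the TDCDP algorithm, whose details are not given here. The function names, the toy instruction set, the latencies, and the single `issue_width` resource limit are all assumptions made for illustration; a real VLIW target also constrains functional units and registers.

```python
from typing import Dict, List


def list_schedule(
    latency: Dict[str, int],
    deps: Dict[str, List[str]],
    issue_width: int,
) -> Dict[str, int]:
    """Greedy critical-path list scheduling (illustrative, not TDCDP).

    Assumes an acyclic dependency graph, latencies >= 1, and a machine
    that can issue at most `issue_width` instructions per cycle.
    Returns the issue cycle chosen for each instruction.
    """
    # Successor lists, needed to compute critical-path priorities.
    succs: Dict[str, List[str]] = {i: [] for i in latency}
    for inst, preds in deps.items():
        for p in preds:
            succs[p].append(inst)

    # Priority = longest latency-weighted path from the instruction to a sink.
    prio: Dict[str, int] = {}

    def height(i: str) -> int:
        if i not in prio:
            prio[i] = latency[i] + max((height(s) for s in succs[i]), default=0)
        return prio[i]

    for i in latency:
        height(i)

    issue: Dict[str, int] = {}
    cycle = 0
    while len(issue) < len(latency):
        # Ready = unscheduled, with every predecessor's result available now.
        ready = [
            i
            for i in latency
            if i not in issue
            and all(
                p in issue and issue[p] + latency[p] <= cycle
                for p in deps.get(i, [])
            )
        ]
        # Highest critical-path priority first; name breaks ties deterministically.
        ready.sort(key=lambda i: (-prio[i], i))
        for i in ready[:issue_width]:
            issue[i] = cycle
        cycle += 1
    return issue


def makespan(issue: Dict[str, int], latency: Dict[str, int]) -> int:
    """Cycle at which the last result becomes available."""
    return max(issue[i] + latency[i] for i in issue)


# Hypothetical basic block: two loads feed a multiply, then an add and a store.
latency = {"ld_a": 2, "ld_b": 2, "mul": 3, "add": 1, "st": 1}
deps = {"mul": ["ld_a", "ld_b"], "add": ["mul"], "st": ["add"]}

wide = list_schedule(latency, deps, issue_width=2)
narrow = list_schedule(latency, deps, issue_width=1)
print(wide, makespan(wide, latency))
print(narrow, makespan(narrow, latency))
```

In this toy block, widening the issue slot from one to two lets both loads issue in the same cycle, which shortens the schedule by one cycle. The greedy choice at each cycle is never revisited, which is why such heuristics can settle on a local optimum.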
