Mounir Bahtat, S. Belkouch, Philippe Elleaume, Phillipe Gall
{"title":"Fast enumeration-based modulo scheduling heuristic for VLIW architectures","authors":"Mounir Bahtat, S. Belkouch, Philippe Elleaume, Phillipe Gall","doi":"10.1109/ICM.2014.7071820","DOIUrl":null,"url":null,"abstract":"Modulo scheduling is a software pipelining technique exploiting instruction-level parallelism (ILP) of VLIW architectures to efficiently implement loops. This paper presents a novel enumeration-based resource-constrained heuristic for modulo scheduling. It takes into consideration the criticality of the nodes, generating near optimal schedules in terms of initiation intervals and register requirements. The scheduling algorithm outperformed better-known heuristics in terms of the quality of schedules, while presenting small compilation time enabling it to be used in a production environment. Experimental results on the VLIW TMS320C6678 DSP processor, showed improved performance on a signal processing set of algorithms.","PeriodicalId":107354,"journal":{"name":"2014 26th International Conference on Microelectronics (ICM)","volume":"252 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 26th International Conference on Microelectronics (ICM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICM.2014.7071820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Modulo scheduling is a software pipelining technique exploiting instruction-level parallelism (ILP) of VLIW architectures to efficiently implement loops. This paper presents a novel enumeration-based resource-constrained heuristic for modulo scheduling. It takes into consideration the criticality of the nodes, generating near optimal schedules in terms of initiation intervals and register requirements. The scheduling algorithm outperformed better-known heuristics in terms of the quality of schedules, while presenting small compilation time enabling it to be used in a production environment. Experimental results on the VLIW TMS320C6678 DSP processor, showed improved performance on a signal processing set of algorithms.