Optimal Launch Bound Selection in CPU-GPU Hybrid Graph Applications with Deep Learning
Authors: Md. Erfanul Haque Rafi, Apan Qasem
Venue: 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)
Publication date: 2022-10-24
DOI: 10.1109/IGSC55832.2022.9969364 (https://doi.org/10.1109/IGSC55832.2022.9969364)
Citation count: 0
Abstract
Graph algorithms, which are at the heart of emerging computation domains such as machine learning, are notoriously difficult to optimize because of their irregular behavior. The challenges are magnified on current CPU-GPU heterogeneous platforms. In this paper, we study the problem of GPU launch bound configuration in hybrid graph algorithms. We train a multi-objective deep neural network to learn a function that maps input graph characteristics and runtime program behavior to a set of launch bound parameters. When applying the launch bounds predicted by our neural network to BFS and SSSP algorithms, we observe as much as a 2.76× speedup on certain graph instances and overall speedups of 1.31× and 1.61×, respectively. Similar improvements are seen in the energy efficiency of the applications, with an average reduction of 14% in peak power consumption across 20 real-world input graphs. Evaluation of the neural network shows that it is robust and generalizable, and that it yields close to 90% accuracy on cross-validation.
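For readers unfamiliar with the term: in CUDA, a kernel can be annotated with `__launch_bounds__(maxThreadsPerBlock, minBlocksPerMultiprocessor)`, which the compiler uses to cap per-thread register usage so that the requested number of blocks can be resident on a multiprocessor. The following is a minimal sketch of that trade-off, not the method of the paper. The function name `register_budget` and both hardware constants are assumptions chosen for illustration, and real compilers also account for register allocation granularity.

```python
# Illustrative sketch only: how a launch-bound pair translates into a rough
# per-thread register budget. Not taken from the paper.

# Assumed hardware limits (typical of recent NVIDIA GPUs; check your device).
REGISTERS_PER_SM = 65536         # assumed 32-bit registers per multiprocessor
MAX_REGISTERS_PER_THREAD = 255   # assumed architectural per-thread cap


def register_budget(max_threads_per_block: int, min_blocks_per_sm: int) -> int:
    """Rough upper bound on registers per thread that still lets
    min_blocks_per_sm blocks of max_threads_per_block threads be resident
    on one multiprocessor at the same time."""
    if max_threads_per_block <= 0 or min_blocks_per_sm <= 0:
        raise ValueError("launch bound parameters must be positive")
    resident_threads = max_threads_per_block * min_blocks_per_sm
    return min(MAX_REGISTERS_PER_THREAD, REGISTERS_PER_SM // resident_threads)


if __name__ == "__main__":
    # Hypothetical candidate configurations, chosen only to show the trend.
    for threads, blocks in [(128, 1), (256, 2), (512, 2), (1024, 2)]:
        budget = register_budget(threads, blocks)
        print(f"threads={threads:4d} blocks={blocks} -> ~{budget} registers/thread")
```

Tighter bounds raise occupancy but shrink the register budget, which can force spills to memory. This tension is one reason a single fixed configuration is unlikely to suit every input graph.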