Scheduling challenges and opportunities in integrated CPU+GPU processors
K. Dev, S. Reda
2016 14th ACM/IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia), 2016-10-01
DOI: 10.1145/2993452.2994307 (https://doi.org/10.1145/2993452.2994307)
Heterogeneous processors that integrate architecturally different devices (CPU and GPU) on the same die provide good performance and energy efficiency for a wide range of workloads. However, they also create challenges and opportunities in scheduling workloads on the appropriate device. Current scheduling practices mainly use the characteristics of kernel workloads to decide the CPU/GPU mapping. In this paper, we first provide detailed infrared imaging results that show the impact of mapping decisions on the thermal and power profiles of CPU+GPU processors. Furthermore, we observe that runtime conditions, such as power and the CPU load from traditional workloads, also affect the mapping decision. To exploit these observations, we propose techniques to characterize OpenCL kernel workloads at runtime and map them to the appropriate device under time-varying physical conditions (i.e., the chip power limit) and CPU load conditions, in particular the number of CPU cores available to the OpenCL kernel. We implement our dynamic scheduler on a real CPU+GPU processor and evaluate it using various OpenCL benchmarks. Compared to the state-of-the-art kernel-level scheduling method, the proposed scheduler provides up to 31% and 10% improvements in runtime and energy, respectively.
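To make the idea of a condition-aware mapping concrete, the following is a minimal sketch, not the authors' scheduler. It assumes a hypothetical per-device estimate of runtime and chip power for a kernel, an idealized CPU model in which runtime scales inversely with the number of available cores, and a simple rule: pick the fastest device that fits under the current power limit. All names (`DeviceEstimate`, `cpu_estimate`, `choose_device`) and all numbers are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DeviceEstimate:
    """Hypothetical estimate for running one kernel on one device."""
    device: str        # "cpu" or "gpu"
    runtime_s: float   # estimated kernel runtime in seconds
    power_w: float     # estimated chip power in watts while the kernel runs


def cpu_estimate(one_core_runtime_s: float, available_cores: int,
                 per_core_power_w: float, base_power_w: float) -> DeviceEstimate:
    """Toy CPU model (assumption): perfect scaling with the available cores."""
    if available_cores < 1:
        raise ValueError("at least one CPU core must be available")
    return DeviceEstimate(
        device="cpu",
        runtime_s=one_core_runtime_s / available_cores,
        power_w=base_power_w + per_core_power_w * available_cores,
    )


def choose_device(estimates: Iterable[DeviceEstimate], power_limit_w: float) -> str:
    """Pick the fastest device whose estimated power fits under the limit.

    If no device fits, fall back to the one with the lowest estimated power.
    """
    candidates = list(estimates)
    if not candidates:
        raise ValueError("no device estimates given")
    feasible = [e for e in candidates if e.power_w <= power_limit_w]
    if feasible:
        return min(feasible, key=lambda e: e.runtime_s).device
    return min(candidates, key=lambda e: e.power_w).device


# Illustrative numbers only: a GPU estimate and a CPU kernel that takes 8 s on one core.
gpu = DeviceEstimate(device="gpu", runtime_s=3.0, power_w=25.0)

# Four free cores and a generous power limit: the CPU estimate is faster.
print(choose_device([cpu_estimate(8.0, 4, 5.0, 10.0), gpu], power_limit_w=35.0))
# Only two free cores (other workloads occupy the rest): the GPU estimate is faster.
print(choose_device([cpu_estimate(8.0, 2, 5.0, 10.0), gpu], power_limit_w=35.0))
# Four free cores but a tighter power limit: the CPU estimate no longer fits.
print(choose_device([cpu_estimate(8.0, 4, 5.0, 10.0), gpu], power_limit_w=28.0))
```

In this toy setting the same kernel maps to different devices as the number of free CPU cores or the power limit changes, which is the kind of dependence on runtime conditions the abstract describes. A real scheduler would obtain its estimates from online measurement rather than fixed constants.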