Raptor: Mitigating CPU-GPU False Sharing Under Unified Memory Systems
Md. Erfanul Haque Rafi, Kaylee Williams, Apan Qasem
2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022-10-24
DOI: https://doi.org/10.1109/IGSC55832.2022.9969376
The introduction of Unified Memory (UM) technology has greatly increased the programmability of CPU-GPU heterogeneous systems. At the same time, Unified Memory systems have given rise to new performance challenges. Achieving the desired performance and energy efficiency on such systems requires careful consideration of data allocation and migration. This paper examines the problem of false sharing under UM. We present Raptor, a system for fast and accurate detection of page-level false sharing in heterogeneous applications. The system employs binary code instrumentation and leverages hardware performance counters to track UM allocations and data access patterns and to pinpoint energy inefficiencies created by false sharing. Experiments on a suite of heterogeneous applications show that false sharing can be a common occurrence in collaborative design paradigms with tight coupling of CPU-GPU tasks. When false sharing is eliminated via a padding scheme, applications are able to achieve higher performance at lower clock frequencies, improving energy efficiency by as much as 2.96×, and by 1.62× and 1.47× on average, on two contemporary CPU-GPU platforms.
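To make the idea of page-level false sharing and its removal by padding concrete, here is a minimal Python sketch. It is not Raptor's implementation: the abstract says Raptor relies on binary instrumentation and hardware performance counters, whereas this sketch works on a hypothetical, already-collected trace of `(device, address)` pairs. The 4096-byte page size, the function names, and the trace format are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed page size in bytes; the real migration granularity depends on the platform.
PAGE_SIZE = 4096


def find_false_shared_pages(accesses, page_size=PAGE_SIZE):
    """Return pages touched by both CPU and GPU with no address in common.

    `accesses` is a hypothetical trace: an iterable of (device, address)
    pairs, where device is "cpu" or "gpu".
    """
    touched = defaultdict(lambda: {"cpu": set(), "gpu": set()})
    for device, addr in accesses:
        touched[addr // page_size][device].add(addr)

    flagged = []
    for page, by_dev in touched.items():
        cpu, gpu = by_dev["cpu"], by_dev["gpu"]
        # Both devices use the page, but never the same address:
        # the page is shared only because of how the data is laid out.
        if cpu and gpu and cpu.isdisjoint(gpu):
            flagged.append(page)
    return sorted(flagged)


def padded_size(nbytes, page_size=PAGE_SIZE):
    """Round an allocation size up to a whole number of pages."""
    return -(-nbytes // page_size) * page_size


def layout(sizes, pad, page_size=PAGE_SIZE):
    """Place buffers back to back and return their starting offsets."""
    offsets, cursor = [], 0
    for n in sizes:
        offsets.append(cursor)
        cursor += padded_size(n, page_size) if pad else n
    return offsets


def trace(offsets, sizes, owners, stride=8):
    """Build a synthetic trace in which each buffer is touched by one device."""
    return [
        (dev, off + i)
        for off, n, dev in zip(offsets, sizes, owners)
        for i in range(0, n, stride)
    ]


# Two 1000-byte buffers: the CPU works on the first, the GPU on the second.
sizes = [1000, 1000]
owners = ["cpu", "gpu"]
unpadded = find_false_shared_pages(trace(layout(sizes, pad=False), sizes, owners))
padded = find_false_shared_pages(trace(layout(sizes, pad=True), sizes, owners))
print("false-shared pages without padding:", unpadded)
print("false-shared pages with padding:", padded)
```

In this toy layout the two unpadded buffers land in the same page, so that page would be a candidate for repeated migration between CPU and GPU even though the devices never touch the same bytes. Rounding each buffer up to a page boundary gives each device its own page, at the cost of some unused memory.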