Joseph Whitehouse, Qinzhe Wu, Shuang Song, E. John, A. Gerstlauer, L. John
{"title":"A Study of Core Utilization and Residency in Heterogeneous Smart Phone Architectures","authors":"Joseph Whitehouse, Qinzhe Wu, Shuang Song, E. John, A. Gerstlauer, L. John","doi":"10.1145/3297663.3310304","DOIUrl":null,"url":null,"abstract":"In recent years, the smart phone platform has seen a rise in the number of cores and the use of heterogeneous clusters as in the Qualcomm Snapdragon, Apple A10 and the Samsung Exynos processors. This paper attempts to understand characteristics of mobile workloads, with measurements on heterogeneous multicore phone platforms with big and little cores. It answers questions such as the following: (i) Do smart phones need multiple cores of different types (eg: big or little)? (ii) Is it energy-efficient to operate with more cores (with less time) or fewer cores even if it might take longer? (iii)What are the best frequencies to operate the cores considering energy efficiency? (iv) Do mobile applications need out-of-order speculative execution cores with complex branch prediction? (v) Is IPC a good performance indicator for early design tradeoff evaluation while working on mobile processor design? Using Geekbench and more than 3 dozen Android applications, and the Workload Automation tool from ARM, we measure core utilization, frequency residencies, and energy efficiency characteristics on two leading edge smart phones. Many characteristics of smartphone platforms are presented, and architectural implications of the observations as well as design considerations for future mobile processors are discussed. A key insight is that multiple big and complex cores are beneficial both from a performance as well as an energy point of view in certain scenarios. It is seen that 4 big cores are utilized during application launch and update phases of applications. Similarly, reboot using all 4 cores at maximum performance provides latency advantages. However, it consumes higher power and energy, and reboot with 2 cores was seen to be more energy efficient than reboot with 1 or 4 cores. Furthermore, inaccurate branch prediction is seen to result in up to 40% mis-speculated instructions in many applications, suggesting that it is important to improve the accuracy of branch predictors in mobile processors. While absolute IPCs are observed to be a poor predictor of benchmark scores, relative IPCs are useful for estimating the impact of microarchitectural changes on benchmark scores.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3297663.3310304","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
In recent years, the smart phone platform has seen a rise in the number of cores and the use of heterogeneous clusters as in the Qualcomm Snapdragon, Apple A10 and the Samsung Exynos processors. This paper attempts to understand characteristics of mobile workloads, with measurements on heterogeneous multicore phone platforms with big and little cores. It answers questions such as the following: (i) Do smart phones need multiple cores of different types (eg: big or little)? (ii) Is it energy-efficient to operate with more cores (with less time) or fewer cores even if it might take longer? (iii)What are the best frequencies to operate the cores considering energy efficiency? (iv) Do mobile applications need out-of-order speculative execution cores with complex branch prediction? (v) Is IPC a good performance indicator for early design tradeoff evaluation while working on mobile processor design? Using Geekbench and more than 3 dozen Android applications, and the Workload Automation tool from ARM, we measure core utilization, frequency residencies, and energy efficiency characteristics on two leading edge smart phones. Many characteristics of smartphone platforms are presented, and architectural implications of the observations as well as design considerations for future mobile processors are discussed. A key insight is that multiple big and complex cores are beneficial both from a performance as well as an energy point of view in certain scenarios. It is seen that 4 big cores are utilized during application launch and update phases of applications. Similarly, reboot using all 4 cores at maximum performance provides latency advantages. However, it consumes higher power and energy, and reboot with 2 cores was seen to be more energy efficient than reboot with 1 or 4 cores. Furthermore, inaccurate branch prediction is seen to result in up to 40% mis-speculated instructions in many applications, suggesting that it is important to improve the accuracy of branch predictors in mobile processors. While absolute IPCs are observed to be a poor predictor of benchmark scores, relative IPCs are useful for estimating the impact of microarchitectural changes on benchmark scores.