Variations in Performance Measurements of Multi-core Processors: A Study of n-Tier Applications
Junhee Park, Qingyang Wang, D. Jayasinghe, Jack Li, Yasuhiko Kanemasa, Masazumi Matsubara, Daisaku Yokoyama, M. Kitsuregawa, C. Pu
2013 IEEE International Conference on Services Computing, published 2013-06-28
DOI: 10.1109/SCC.2013.116 (https://doi.org/10.1109/SCC.2013.116)
Citations: 10
Abstract
The prevalence of multi-core processors has raised the question of whether applications can use the increasing number of cores efficiently in order to provide predictable quality of service (QoS). In this paper, we study the horizontal scalability of n-tier application performance within a multi-core processor (MCP). Through extensive measurements of the RUBBoS benchmark, we found one major source of performance variation within an MCP: the mapping of cores to virtual CPUs can significantly lower the on-chip cache hit ratio, causing performance drops of up to 22% without obvious changes in resource utilization. After eliminating these variations by fixing the MCP core mapping, we measured the impact of three mainstream hypervisors (the dominant Commercial Hypervisor, Xen, and KVM) on intra-MCP horizontal scalability. On a dual-processor, quad-core machine (8 cores in total), we found both similarities and dissimilarities among the hypervisors. One similarity is a non-monotonic scalability trend when running a browse-only, CPU-intensive workload: throughput increases up to 4 cores and then decreases beyond 4 cores. This problem can be traced to the management of the last-level cache of the CPU packages. One dissimilarity is the hypervisors' handling of write operations in mixed read/write, I/O-intensive workloads; specifically, the Commercial Hypervisor provides more than twice the throughput of KVM. Our measurements show that both the MCP cache architecture and the choice of hypervisor affect the efficiency and horizontal scalability achievable by applications. However, despite their differences, all three mainstream hypervisors have difficulty with intra-MCP horizontal scalability beyond 4 cores for n-tier applications.
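
The abstract names two hardware-level ideas, fixing the core-to-vCPU mapping and the role of the per-package last-level cache, without showing what a fixed mapping looks like. The following is a minimal sketch, not the authors' method. It assumes a hypothetical topology of 2 packages with 4 cores each (matching the dual-processor, quad-core host), assumes that each package has its own last-level cache, and uses illustrative core IDs. The helper names `plan_pinning`, `packages_spanned`, and `pin_current_process` are invented for this example.

```python
import os
from typing import Dict, List

# Hypothetical topology: 2 packages x 4 cores. The core IDs are an assumption;
# on a real Linux host they can be read from
# /sys/devices/system/cpu/cpu*/topology/physical_package_id.
PACKAGES: List[List[int]] = [[0, 1, 2, 3], [4, 5, 6, 7]]


def plan_pinning(n_vcpus: int, packages: List[List[int]] = PACKAGES) -> Dict[int, int]:
    """Map vCPU index -> physical core, filling one package before the next."""
    cores = [core for package in packages for core in package]
    if not 1 <= n_vcpus <= len(cores):
        raise ValueError(f"n_vcpus must be between 1 and {len(cores)}")
    return {vcpu: cores[vcpu] for vcpu in range(n_vcpus)}


def packages_spanned(mapping: Dict[int, int], packages: List[List[int]] = PACKAGES) -> int:
    """Count how many packages (assumed: one last-level cache each) are used."""
    used = set(mapping.values())
    return sum(1 for package in packages if used.intersection(package))


def pin_current_process(cores: List[int]) -> bool:
    """Pin the calling process to the given cores.

    Uses os.sched_setaffinity, which exists only on some platforms (e.g. Linux);
    returns False without doing anything where it is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cores))
    return True
```

Under these assumptions, `plan_pinning(4)` keeps all vCPUs on one package, while any plan with 5 or more vCPUs necessarily spans both packages. This is one plausible reading of why the 4-core boundary matters on this host; the abstract itself states only that the trend traces to last-level cache management. A deterministic plan like this also removes the run-to-run variation that an arbitrary mapping could introduce.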