{"title":"EuroSys '22: Seventeenth European Conference on Computer Systems, Rennes, France, April 5 - 8, 2022","authors":"","doi":"10.1145/3492321","DOIUrl":"https://doi.org/10.1145/3492321","url":null,"abstract":"","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91322469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EuroSys '21: Sixteenth European Conference on Computer Systems, Online Event, United Kingdom, April 26-28, 2021","authors":"","doi":"10.1145/3447786","DOIUrl":"https://doi.org/10.1145/3447786","url":null,"abstract":"","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84301512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EuroSys '20: Fifteenth EuroSys Conference 2020, Heraklion, Greece, April 27-30, 2020","authors":"","doi":"10.1145/3342195","DOIUrl":"https://doi.org/10.1145/3342195","url":null,"abstract":"","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"43 1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78763375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application-specific quantum for multi-core platform scheduler","authors":"Boris Teabe, A. Tchana, D. Hagimont","doi":"10.1145/2901318.2901340","DOIUrl":"https://doi.org/10.1145/2901318.2901340","url":null,"abstract":"Scheduling has a significant influence on application performance. Deciding on a quantum length can be very tricky, especially when concurrent applications have various characteristics. This is actually the case in virtualized cloud computing environments where virtual machines from different users are colocated on the same physical machine. We claim that in a multi-core virtualized platform, different quantum lengths should be associated with different application types. We apply this principle in a new scheduler called AQL_Sched. We identified 5 main application types and experimentally found the best quantum length for each of them. Dynamically, AQL_Sched associates an application type with each virtual CPU (vCPU) and schedules vCPUs according to their type on physical CPU (pCPU) pools with the best quantum length. Therefore, each vCPU is scheduled on a pCPU with the best quantum length. We implemented a prototype of AQL_Sched in Xen and we evaluated it with various reference benchmarks (SPECweb2009, SPECmail2009, SPEC CPU2006, and PARSEC). The evaluation results show that AQL_Sched outperforms Xen's credit scheduler. For instance, up to 20%, 10% and 15% of performance improvements have been obtained with SPECweb2009, SPEC CPU2006 and PARSEC, respectively.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75426718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Picocenter: supporting long-lived, mostly-idle applications in cloud environments","authors":"L. Zhang, J. Litton, Frank Cangialosi, Theophilus A. Benson, Dave Levin, A. Mislove","doi":"10.1145/2901318.2901345","DOIUrl":"https://doi.org/10.1145/2901318.2901345","url":null,"abstract":"Cloud computing has evolved to meet user demands, from arbitrary VMs offered by IaaS to the narrow application interfaces of PaaS. Unfortunately, there exists an intermediate point that is not well met by today's offerings: users who wish to run arbitrary, already available binaries (as opposed to rewriting their own application for a PaaS) yet expect their applications to be long-lived but mostly idle (as opposed to the always-on VM of IaaS). For example, end users who wish to run their own email or DNS server. In this paper, we explore an alternative approach for cloud computation based on a process-like abstraction rather than a virtual machine abstraction, thereby gaining the scalability and efficiency of PaaS along with the generality of IaaS. We present the design of Picocenter, a hosting infrastructure for such applications that enables use of legacy applications. The key technical challenge in Picocenter is enabling fast swapping of applications to and from cloud storage (since, by definition, applications are largely idle, we expect them to spend the majority of their time swapped out). We develop an ActiveSet technique that prefetches the application's predicted memory working set when reviving an application. An evaluation on EC2 demonstrates that using ActiveSet, Picocenter is able to swap in applications in under 250 ms even when they are stored in S3 while swapped out.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84513847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical condition synchronization for transactional memory","authors":"Chao Wang, Michael F. Spear","doi":"10.1145/2901318.2901342","DOIUrl":"https://doi.org/10.1145/2901318.2901342","url":null,"abstract":"Few transactional memory implementations allow for condition synchronization among transactions. The problems are many, most notably the lack of consensus about a single appropriate linguistic construct, and the lack of mechanisms that are compatible with hardware transactional memory. In this paper, we introduce a broadly useful mechanism for supporting condition synchronization among transactions. Our mechanism supports a number of linguistic constructs for coordinating transactions, and does so without introducing overhead on in-flight hardware transactions. Experiments show that our mechanisms work well, and that the diversity of linguistic constructs allows programmers to choose the technique that is best suited to a particular application.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85628277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sinter: low-bandwidth remote access for the visually-impaired","authors":"Syed Masum Billah, Donald E. Porter, I. Ramakrishnan","doi":"10.1145/2901318.2901335","DOIUrl":"https://doi.org/10.1145/2901318.2901335","url":null,"abstract":"Computer users commonly use applications designed for different operating systems (OSes). For instance, a Mac user may access a cloud-based Windows remote desktop to run an application required for her job. Current remote access protocols do not work well with screen readers, creating a disproportionate burden for users with visual impairments. These users' productivity depends on features of a specific screen reader, and readers are locked-in to a specific OS. The only current option is to run a different screen reader on each platform, which harms productivity. This paper describes a framework, called Sinter, that efficiently and seamlessly supports remote, cross-platform screen reading, without modifying the application or the screen reader. Sinter addresses these problems with a platform-independent intermediate representation (IR) of a remote application's user interface (UI). The Sinter IR encapsulates platform-specific accessibility code on the remote system, facilitates development of additional accessibility features, and is simple enough to be reconstructed and read on any client platform. In the example above, Sinter allows a Mac-only reader to read remote Windows applications. Sinter supports low-bandwidth, remote access to a wide range of applications, including Microsoft Word and Apple Mail, with both Windows and OS X clients and servers, as well as a web browser client. Sinter's IR-level programming model facilitates development of accessibility features and other enhancements, transparently to the remote application and reader. Sinter's latency is low enough for practical use, even over a relatively slow network connection.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81981516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"zExpander: a key-value cache with both high performance and fewer misses","authors":"Xingbo Wu, Li Zhang, Yandong Wang, Yufei Ren, M. Hack, Song Jiang","doi":"10.1145/2901318.2901332","DOIUrl":"https://doi.org/10.1145/2901318.2901332","url":null,"abstract":"While key-value (KV) cache, such as memcached, dedicates a large volume of expensive memory to holding performance-critical data, it is important to improve memory efficiency, or to reduce cache miss ratio without adding more memory. As we find that optimizing replacement algorithms is of limited effect for this purpose, a promising approach is to use a compact data organization and data compression to increase effective cache size. However, this approach has the risk of degrading the cache's performance due to additional computation cost. A common perception is that a high-performance KV cache is not compatible with use of data compacting techniques. In this paper, we show that, by leveraging highly skewed data access pattern common in real-world KV cache workloads, we can both reduce miss ratio through improved memory efficiency and maintain high performance for a KV cache. Specifically, we design and implement a KV cache system, named zExpander, which dynamically partitions the cache into two sub-caches. One serves frequently accessed data for high performance, and the other compacts data and metadata for high memory efficiency to reduce misses. Experiments show that zExpander can increase memcached's effective cache size by up to 2x and reduce miss ratio by up to 46%. When integrated with a cache of a higher performance, its advantages remain. For example, with 24 threads on a YCSB workload zExpander can achieve throughput of 32 million RPS with 36% of its cache misses removed.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84097172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting variability for energy optimization of parallel programs","authors":"W. Lavrijsen, Costin Iancu, W. A. Jong, Xin Chen, K. Schwan","doi":"10.1145/2901318.2901329","DOIUrl":"https://doi.org/10.1145/2901318.2901329","url":null,"abstract":"In this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage in scientific applications. Our main insight is that noise is intrinsic to large scale parallel executions and it appears whenever shared resources are contended. The presence of noise allows us to identify and manipulate any program regions amenable to DVFS. When compared to previous energy optimizations that make per core decisions using predictions of the running time, our scheme uses a qualitative approach to recognize the signature of executions amenable to DVFS. By recognizing the \"shape of variability\" we can optimize codes with highly dynamic behavior, which pose challenges to all existing DVFS techniques. We validate our approach using offline and online analyses for one-sided and two-sided communication paradigms. We have applied our methods to NWChem, and we show best case improvements in energy use of 12% at no loss in performance when using online optimizations running on 720 Haswell cores with one-sided communication. With NWChem on MPI two-sided and offline analysis, capturing the initialization, we find energy savings of up to 20%, with less than 1% performance cost.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88000982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the Eleventh European Conference on Computer Systems","authors":"Cristian Cadar, P. Pietzuch, K. Keeton, R. Rodrigues","doi":"10.1145/2901318","DOIUrl":"https://doi.org/10.1145/2901318","url":null,"abstract":"Welcome to EuroSys 2016, held at Imperial College London, UK! This year's program includes 38 wonderful papers that cover a wide range of topics, including multicore systems and concurrency, distributed machine learning, studies of familiar operating system abstractions, heterogeneous and non-volatile memory systems, data center networking, novel techniques for energy and power optimization, and experiences from production systems.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87854342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}