Ian Aragon Escobar, E. Alchieri, F. Dotti, F. Pedone
State machine replication (SMR) is a well-known approach to implementing fault-tolerant services, providing high availability and strong consistency. To boost the performance of SMR, some proposals execute independent commands concurrently, while dependent commands execute sequentially in the total delivery order. The most general approach to handling command dependencies resorts to a directed acyclic graph (DAG), where nodes represent commands and edges represent dependencies. In this paper we show that due to the command arrival and multithreaded execution rates of SMR, a highly concurrent implementation of a DAG is needed. We show that a typical coarse-grained DAG implementation, where the whole graph is a critical section, results in a bottleneck in the replica. We propose two improvements to the coarse-grained DAG approach: fine-grained algorithms, using lock-coupling, and lock-free algorithms. Our fine-grain algorithms lock individual vertices in the DAG. The lock-free algorithms use nonblocking synchronization, with atomic operations, and lazy synchronization to postpone physical removal of nodes. All algorithms were integrated in a parallel SMR prototype. Experimental evaluation revealed that the fine-grained algorithms are also subject to a bottleneck. The lock-free implementation, however, sports linear speedup with the number of working threads, in some cases scaling up to 64 threads.
{"title":"Boosting concurrency in Parallel State Machine Replication","authors":"Ian Aragon Escobar, E. Alchieri, F. Dotti, F. Pedone","doi":"10.1145/3361525.3361549","DOIUrl":"https://doi.org/10.1145/3361525.3361549","url":null,"abstract":"State machine replication (SMR) is a well-known approach to implementing fault-tolerant services, providing high availability and strong consistency. To boost the performance of SMR, some proposals execute independent commands concurrently, while dependent commands execute sequentially in the total delivery order. The most general approach to handling command dependencies resorts to a directed acyclic graph (DAG), where nodes represent commands and edges represent dependencies. In this paper we show that due to the command arrival and multithreaded execution rates of SMR, a highly concurrent implementation of a DAG is needed. We show that a typical coarse-grained DAG implementation, where the whole graph is a critical section, results in a bottleneck in the replica. We propose two improvements to the coarse-grained DAG approach: fine-grained algorithms, using lock-coupling, and lock-free algorithms. Our fine-grain algorithms lock individual vertices in the DAG. The lock-free algorithms use nonblocking synchronization, with atomic operations, and lazy synchronization to postpone physical removal of nodes. All algorithms were integrated in a parallel SMR prototype. Experimental evaluation revealed that the fine-grained algorithms are also subject to a bottleneck. The lock-free implementation, however, sports linear speedup with the number of working threads, in some cases scaling up to 64 threads.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120961158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sheung Chi Chan, J. Cheney, Pramod Bhatotia, Thomas Pasquier, Ashish Gehani, Hassaan Irshad, Lucian Carata, M. Seltzer
System level provenance is of widespread interest for applications such as security enforcement and information protection. However, testing the correctness or completeness of provenance capture tools is challenging and currently done manually. In some cases there is not even a clear consensus about what behavior is correct. We present an automated tool, ProvMark, that uses an existing provenance system as a black box and reliably identifies the provenance graph structure recorded for a given activity, by a reduction to subgraph isomorphism problems handled by an external solver. ProvMark is a beginning step in the much needed area of testing and comparing the expressiveness of provenance systems. We demonstrate ProvMark's usefuless in comparing three capture systems with different architectures and distinct design philosophies.
{"title":"ProvMark: A Provenance Expressiveness Benchmarking System","authors":"Sheung Chi Chan, J. Cheney, Pramod Bhatotia, Thomas Pasquier, Ashish Gehani, Hassaan Irshad, Lucian Carata, M. Seltzer","doi":"10.1145/3361525.3361552","DOIUrl":"https://doi.org/10.1145/3361525.3361552","url":null,"abstract":"System level provenance is of widespread interest for applications such as security enforcement and information protection. However, testing the correctness or completeness of provenance capture tools is challenging and currently done manually. In some cases there is not even a clear consensus about what behavior is correct. We present an automated tool, ProvMark, that uses an existing provenance system as a black box and reliably identifies the provenance graph structure recorded for a given activity, by a reduction to subgraph isomorphism problems handled by an external solver. ProvMark is a beginning step in the much needed area of testing and comparing the expressiveness of provenance systems. We demonstrate ProvMark's usefuless in comparing three capture systems with different architectures and distinct design philosophies.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123860917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
K. R. Jayaram, Vinod Muthusamy, Parijat Dube, Vatche Isahagian, Chen Wang, Benjamin Herta, S. Boag, Diana Arroyo, A. Tantawi, Archit Verma, Falk Pollok, Rania Y. Khalaf
Deep learning (DL) is becoming increasingly popular in several application domains and has made several new application features involving computer vision, speech recognition and synthesis, self-driving automobiles, drug design, etc. feasible and accurate. As a result, large scale "on-premise" and "cloud-hosted" deep learning platforms have become essential infrastructure in many organizations. These systems accept, schedule, manage and execute DL training jobs at scale. This paper describes the design, implementation and our experiences with FfDL, a DL platform used at IBM. We describe how our design balances dependability with scalability, elasticity, flexibility and efficiency. We examine FfDL qualitatively through a retrospective look at the lessons learned from building, operating, and supporting FfDL; and quantitatively through a detailed empirical evaluation of FfDL, including the overheads introduced by the platform for various DL models, the load and performance observed in a real case study using FfDL within our organization, the frequency of various faults observed including faults that we did not anticipate, and experiments demonstrating the benefits of various scheduling policies. FfDL has been open-sourced.
{"title":"FfDL: A Flexible Multi-tenant Deep Learning Platform","authors":"K. R. Jayaram, Vinod Muthusamy, Parijat Dube, Vatche Isahagian, Chen Wang, Benjamin Herta, S. Boag, Diana Arroyo, A. Tantawi, Archit Verma, Falk Pollok, Rania Y. Khalaf","doi":"10.1145/3361525.3361538","DOIUrl":"https://doi.org/10.1145/3361525.3361538","url":null,"abstract":"Deep learning (DL) is becoming increasingly popular in several application domains and has made several new application features involving computer vision, speech recognition and synthesis, self-driving automobiles, drug design, etc. feasible and accurate. As a result, large scale \"on-premise\" and \"cloud-hosted\" deep learning platforms have become essential infrastructure in many organizations. These systems accept, schedule, manage and execute DL training jobs at scale. This paper describes the design, implementation and our experiences with FfDL, a DL platform used at IBM. We describe how our design balances dependability with scalability, elasticity, flexibility and efficiency. We examine FfDL qualitatively through a retrospective look at the lessons learned from building, operating, and supporting FfDL; and quantitatively through a detailed empirical evaluation of FfDL, including the overheads introduced by the platform for various DL models, the load and performance observed in a real case study using FfDL within our organization, the frequency of various faults observed including faults that we did not anticipate, and experiments demonstrating the benefits of various scheduling policies. FfDL has been open-sourced.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115573510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Birke, Isabelly Rocha, Juan F. Pérez, V. Schiavoni, P. Felber, L. Chen
Today's big data clusters based on the MapReduce paradigm are capable of executing analysis jobs with multiple priorities, providing differential latency guarantees. Traces from production systems show that the latency advantage of high-priority jobs comes at the cost of severe latency degradation of low-priority jobs as well as daunting resource waste caused by repetitive eviction and re-execution of low-priority jobs. We advocate a new resource management design that exploits the idea of differential approximation and sprinting. The unique combination of approximation and sprinting avoids the eviction of low-priority jobs and its consequent latency degradation and resource waste. To this end, we designed, implemented and evaluated DiAS, an extension of the Spark processing engine to support deflate jobs by dropping tasks and to sprint jobs. Our experiments on scenarios with two and three priority classes indicate that DiAS achieves up to 90% and 60% latency reduction for low- and high-priority jobs, respectively. DiAS not only eliminates resource waste but also (surprisingly) lowers energy consumption up to 30% at only a marginal accuracy loss for low-priority jobs.
{"title":"Differential Approximation and Sprinting for Multi-Priority Big Data Engines","authors":"R. Birke, Isabelly Rocha, Juan F. Pérez, V. Schiavoni, P. Felber, L. Chen","doi":"10.1145/3361525.3361547","DOIUrl":"https://doi.org/10.1145/3361525.3361547","url":null,"abstract":"Today's big data clusters based on the MapReduce paradigm are capable of executing analysis jobs with multiple priorities, providing differential latency guarantees. Traces from production systems show that the latency advantage of high-priority jobs comes at the cost of severe latency degradation of low-priority jobs as well as daunting resource waste caused by repetitive eviction and re-execution of low-priority jobs. We advocate a new resource management design that exploits the idea of differential approximation and sprinting. The unique combination of approximation and sprinting avoids the eviction of low-priority jobs and its consequent latency degradation and resource waste. To this end, we designed, implemented and evaluated DiAS, an extension of the Spark processing engine to support deflate jobs by dropping tasks and to sprint jobs. Our experiments on scenarios with two and three priority classes indicate that DiAS achieves up to 90% and 60% latency reduction for low- and high-priority jobs, respectively. DiAS not only eliminates resource waste but also (surprisingly) lowers energy consumption up to 30% at only a marginal accuracy loss for low-priority jobs.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116075689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The large penetration and continued growth in ownership of personal electronic devices represents a freely available and largely untapped source of computing power. To leverage those, we present Pando, a new volunteer computing tool based on a declarative concurrent programming model and implemented using JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying number of failure-prone personal devices contributed by volunteers to parallelize the application of a function on a stream of values, by using the devices' browsers. We show that Pando can provide throughput improvements compared to a single personal device, on a variety of compute-bound applications including animation rendering and image processing. We also show the flexibility of our approach by deploying Pando on personal devices connected over a local network, on Grid5000, a French-wide computing grid in a virtual private network, and seven PlanetLab nodes distributed in a wide area network over Europe.
{"title":"Pando: Personal Volunteer Computing in Browsers","authors":"Erick Lavoie, L. Hendren, F. Desprez, M. Correia","doi":"10.1145/3361525.3361539","DOIUrl":"https://doi.org/10.1145/3361525.3361539","url":null,"abstract":"The large penetration and continued growth in ownership of personal electronic devices represents a freely available and largely untapped source of computing power. To leverage those, we present Pando, a new volunteer computing tool based on a declarative concurrent programming model and implemented using JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying number of failure-prone personal devices contributed by volunteers to parallelize the application of a function on a stream of values, by using the devices' browsers. We show that Pando can provide throughput improvements compared to a single personal device, on a variety of compute-bound applications including animation rendering and image processing. We also show the flexibility of our approach by deploying Pando on personal devices connected over a local network, on Grid5000, a French-wide computing grid in a virtual private network, and seven PlanetLab nodes distributed in a wide area network over Europe.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130697855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This edition marks the 13th ACM/IFIP/USENIX Middleware Conference. The first conference was held in the Lake District of England in 1998, and its origins reflected the growing importance of middleware and the realization that middleware represented an active, rigorous, growing and evolving research discipline in its own right. The definition of the term "middleware" has also evolved in the past decade, but retains, at its core, the notion of different levels/layers of abstractions in distributed-computing systems. Since its inception, the Middleware Conference has remained a premier forum for the discussion of innovations and recent advances in the design, implementation, experimentation, deployment, and usage of middleware systems.
{"title":"Proceedings of the 20th International Middleware Conference","authors":"P. Narasimhan, P. Triantafillou","doi":"10.1145/3361525","DOIUrl":"https://doi.org/10.1145/3361525","url":null,"abstract":"This edition marks the 13th ACM/IFIP/USENIX Middleware Conference. The first conference was held in the Lake District of England in 1998, and its origins reflected the growing importance of middleware and the realization that middleware represented an active, rigorous, growing and evolving research discipline in its own right. The definition of the term \"middleware\" has also evolved in the past decade, but retains, at its core, the notion of different levels/layers of abstractions in distributed-computing systems. Since its inception, the Middleware Conference has remained a premier forum for the discussion of innovations and recent advances in the design, implementation, experimentation, deployment, and usage of middleware systems.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130730169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}