FabricCRDT: A Conflict-Free Replicated Datatypes Approach to Permissioned Blockchains
Pezhman Nasirifard, R. Mayer, H. Jacobsen
DOI: 10.1145/3361525.3361540

With the increased adoption of blockchain technologies, permissioned blockchains such as Hyperledger Fabric provide a robust ecosystem for developing production-grade decentralized applications. However, the additional latency between executing and committing transactions, due to Fabric's three-phase Execute-Order-Validate (EOV) transaction lifecycle, is a potential scalability bottleneck. The added latency increases the probability of concurrent updates to the same keys by different transactions, leading to transaction failures caused by Fabric's concurrency control mechanism. These failures increase application development complexity and decrease Fabric's throughput. Conflict-free Replicated Datatypes (CRDTs) provide a solution for merging and resolving conflicts in the presence of concurrent updates. In this work, we introduce FabricCRDT, an approach for integrating CRDTs into Fabric. Our evaluations show that, in general, FabricCRDT offers higher throughput of successful transactions than Fabric, while successfully committing and merging all conflicting transactions without any failures.
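
The core CRDT property FabricCRDT relies on is that concurrent updates merge deterministically instead of failing validation. A grow-only counter is the textbook illustration: its merge takes the per-replica maximum, which is commutative, associative, and idempotent. A minimal sketch (the class and its names are ours for illustration, not FabricCRDT's API):

    # Minimal G-Counter CRDT: concurrent increments merge without conflict.
    class GCounter:
        def __init__(self):
            self.counts = {}  # replica id -> increments observed at that replica

        def increment(self, replica_id, n=1):
            self.counts[replica_id] = self.counts.get(replica_id, 0) + n

        def value(self):
            return sum(self.counts.values())

        def merge(self, other):
            # Per-replica maximum: the order of merges never changes the
            # outcome, so two transactions updating the same key both commit.
            for rid, c in other.counts.items():
                self.counts[rid] = max(self.counts.get(rid, 0), c)

    a, b = GCounter(), GCounter()
    a.increment("peer0")
    b.increment("peer1", 2)
    a.merge(b)
    assert a.value() == 3  # both concurrent updates are preserved
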
OS-Augmented Oversubscription of Opportunistic Memory with a User-Assisted OOM Killer
Wei Chen, Aidi Pi, Shaoqi Wang, Xiaobo Zhou
DOI: 10.1145/3361525.3361534

Exploiting opportunistic memory by oversubscription is an appealing approach to improving cluster utilization and throughput. In this paper, we find that the efficacy of memory oversubscription depends on whether the oversubscribed tasks can be killed by an Out-Of-Memory (OOM) killer in time to avoid significant memory thrashing under memory pressure. However, current approaches in modern cluster schedulers are unable to unleash the power of opportunistic memory because their user-space OOM killers cannot deliver a task-killing signal in time to terminate the oversubscribed tasks. Our experiments show that a user-space OOM killer fails to do so because it lacks memory-pressure knowledge from the OS, while the kernel-space Linux OOM killer is too conservative to relieve memory pressure. In this paper, we design a user-assisted OOM killer (namely, UA killer) in kernel space, an OS augmentation for accurate thrashing detection and agile task killing. To identify a thrashing task, UA killer features a novel mechanism, constraint thrashing. On top of UA killer, we develop Charon, a cluster scheduler for on-demand oversubscription of opportunistic memory. We implement Charon upon Mercury, a state-of-the-art opportunistic cluster scheduler. Extensive experiments with a Google trace in a 26-node cluster show that Charon can: (1) achieve agile task killing, (2) improve best-effort job throughput by 3.5X over Mercury while prioritizing production jobs, and (3) improve the 90th-percentile job completion time of production jobs by 62% over the Kubernetes opportunistic scheduler.
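
UA killer detects thrashing inside the kernel; a rough user-space analogue of the signal it acts on is the system-wide major page-fault rate, whose sustained spike is the classic thrashing symptom. A sketch under that assumption (the thresholds are arbitrary), not the paper's mechanism:

    # User-space approximation of thrashing detection: sample the major
    # page-fault counter from /proc/vmstat and flag a sustained spike.
    # UA killer performs this detection in kernel space; this is an analogue.
    import time

    def read_pgmajfault():
        with open("/proc/vmstat") as f:
            for line in f:
                if line.startswith("pgmajfault"):
                    return int(line.split()[1])
        raise RuntimeError("pgmajfault counter not found")

    def thrashing_detected(threshold_per_sec=1000, window=5):
        spikes = 0
        prev = read_pgmajfault()
        while True:
            time.sleep(1)
            cur = read_pgmajfault()
            rate = cur - prev  # major faults in the last second
            prev = cur
            spikes = spikes + 1 if rate > threshold_per_sec else 0
            if spikes >= window:  # sustained, not transient: likely thrashing
                return True
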
ReLAQS
Logan Stafman, Andrew Or, M. Freedman
DOI: 10.1145/3361525.3361553

Approximate Query Processing has become increasingly popular as larger data sizes have increased query latency in distributed query processing systems. To provide approximate results, such systems return intermediate results and iteratively update these approximations as they process more data. In shared clusters, however, these systems waste resources by directing them to queries that are no longer improving the results given to users. We describe ReLAQS, a cluster scheduling system for online aggregation queries that aims to reduce latency by assigning resources to the queries with the most potential for improvement. ReLAQS utilizes the approximate results each query returns to periodically estimate how much progress each concurrent query is currently making. It then uses this information to predict how much progress each query is expected to make in the near future and redistributes resources in real time to maximize the overall quality of the answers returned across the cluster. Experiments show that ReLAQS achieves a reduction in latency of up to 47% compared to traditional fair schedulers.
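
The scheduling idea admits a compact sketch: if each query reports how fast its approximation error has recently been shrinking per unit of resource, slots can be allotted proportionally to that rate. The estimator and the proportional rule are our simplification, not ReLAQS's exact policy:

    # Progress-based scheduling sketch: queries whose approximations are still
    # converging quickly get more slots; plateaued queries get none.
    def allocate_shares(queries, total_slots):
        # queries: list of (query_id, recent error drop per slot-second)
        rates = {q: max(rate, 0.0) for q, rate in queries}
        total = sum(rates.values())
        if total == 0:
            even = total_slots // max(len(rates), 1)
            return {q: even for q in rates}  # nothing improving: fall back to fair
        # Rounded shares; a real scheduler would reconcile the remainder.
        return {q: round(total_slots * r / total) for q, r in rates.items()}

    shares = allocate_shares([("q1", 0.08), ("q2", 0.002), ("q3", 0.0)], 100)
    print(shares)  # q1 dominates; q3, whose answer has plateaued, gets nothing
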
Combining it all: Cost minimal and low-latency stream processing across distributed heterogeneous infrastructures
Henriette Röger, Sukanya Bhowmik, K. Rothermel
DOI: 10.1145/3361525.3361551

Control mechanisms for stream processing applications (SPAs) that ensure latency bounds at minimal runtime cost mostly target a specific infrastructure, e.g., homogeneous nodes. With the growing popularity of the Internet of Things, fog, and edge computing, SPAs are increasingly distributed over heterogeneous infrastructures, triggering the need for a holistic SPA control that still accounts for heterogeneity. We therefore combine individual control mechanisms via the latency-distribution problem, which seeks to distribute latency budgets to the individually managed components of a distributed SPA for lightweight yet effective end-to-end control. To this end, we introduce a hierarchical control architecture, give a formal definition of the latency-distribution problem, and provide both an ILP formulation that finds an optimal solution and a heuristic approach, thereby enabling the combination of individual control mechanisms in one SPA while ensuring global cost minimality. Our evaluations show that both solutions are effective: while the heuristic approach is only slightly more costly than the optimal ILP solution, it significantly reduces runtime and communication overhead.
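
The latency-distribution problem has the shape of a small integer program. The paper's exact formulation is not reproduced here; under our own illustrative notation, where each component i chooses an operating point k that guarantees local latency l_ik at runtime cost c_ik, a formulation of this kind reads:

    % Hedged sketch of a latency-budget ILP (our notation, not the paper's).
    % x_{ik} = 1 iff component i runs at operating point k; L is the
    % end-to-end latency bound the budgets must jointly respect.
    \begin{align}
      \min \quad & \sum_{i=1}^{n} \sum_{k} c_{ik}\, x_{ik} \\
      \text{s.t.} \quad & \sum_{k} x_{ik} = 1 && \forall i \\
      & \sum_{i=1}^{n} \sum_{k} l_{ik}\, x_{ik} \le L \\
      & x_{ik} \in \{0, 1\} && \forall i, k
    \end{align}
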
EnclaveCache
Lixia Chen, Jian Li, Ruhui Ma, Haibing Guan, Hans A. Jacobsen
DOI: 10.1145/3361525.3361533

With in-memory key-value caches such as Redis and Memcached being a key component of many systems for improving throughput and reducing latency, cloud caches have been widely adopted by small companies to deploy their own cache systems. However, data security remains a major concern that affects the adoption of cloud caches. A tenant's data stored in a multi-tenant cloud environment faces threats from both co-located tenants and the untrusted cloud provider. We propose EnclaveCache, a multi-tenant key-value cache that provides data confidentiality and privacy by leveraging Intel Software Guard Extensions (SGX). EnclaveCache utilizes multiple SGX enclaves to enforce data isolation among co-located tenants. With a carefully designed key-distribution procedure, EnclaveCache ensures that a tenant-specific encryption key is securely guarded by an enclave, which performs the cryptographic operations on that tenant's data. Experimental results show that EnclaveCache achieves performance comparable to that of traditional key-value caches with secure communication, with a performance overhead of 13%, while ensuring security guarantees and better scalability.
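
The tenant-keyed design can be illustrated without SGX: every tenant's values are encrypted under that tenant's own key before they touch the shared store. In EnclaveCache the key is guarded by an enclave; the sketch below elides the enclave boundary and shows only the per-tenant cryptographic isolation (names and API shape are ours):

    # Tenant-keyed cache sketch: the shared backing store holds only
    # ciphertext, and each tenant's data is sealed under its own AES-GCM key.
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    class TenantCache:
        def __init__(self):
            self.keys = {}   # tenant id -> key (enclave-guarded in the paper)
            self.store = {}  # shared store: ciphertext only

        def register(self, tenant):
            self.keys[tenant] = AESGCM.generate_key(bit_length=256)

        def put(self, tenant, k, v: bytes):
            nonce = os.urandom(12)
            # The key name is bound in as associated data, so ciphertexts
            # cannot be swapped between keys undetected.
            ct = AESGCM(self.keys[tenant]).encrypt(nonce, v, k.encode())
            self.store[(tenant, k)] = (nonce, ct)

        def get(self, tenant, k) -> bytes:
            nonce, ct = self.store[(tenant, k)]
            return AESGCM(self.keys[tenant]).decrypt(nonce, ct, k.encode())

    c = TenantCache()
    c.register("t1")
    c.put("t1", "user:42", b"alice")
    assert c.get("t1", "user:42") == b"alice"
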
SlimGuard
Beichen Liu, Pierre Olivier, B. Ravindran
DOI: 10.1145/3361525.3361532

Attacks on the heap are an increasingly severe threat. State-of-the-art secure dynamic memory allocators can offer protection; however, their memory footprint is high, making them suboptimal in many situations. We introduce SlimGuard, a secure allocator whose design is driven by memory efficiency. Among other features, SlimGuard uses an efficient fine-grained size-class indexing mechanism and implements a novel dynamic canary scheme. It offers low memory overhead due to its size classes optimized for canary usage, its on-demand metadata allocation, and the combination of randomized allocations and over-provisioning into a single memory-efficient security feature. SlimGuard protects against widespread heap-related attacks such as overflows, over-reads, double/invalid free, and use-after-free. Evaluation over a wide range of applications shows that it offers a significant reduction in memory consumption compared to the state-of-the-art secure allocator (up to 2x in macro-benchmarks), while offering similar or better security guarantees and good performance.
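
A dynamic canary scheme can be modeled in a few lines: place a randomized guard byte immediately after each allocation's usable bytes and verify it on free, catching linear overflows before the memory is reused. The toy model below is ours; SlimGuard implements the check inside the allocator, per size class:

    # Toy heap-canary model: a random guard byte sits past the user bytes
    # and is checked on free, so a linear overflow is detected.
    import os

    class CanaryHeap:
        def __init__(self):
            self.live = {}  # handle -> (buffer, user size, canary byte)

        def malloc(self, size):
            canary = os.urandom(1)[0]      # randomized per allocation
            buf = bytearray(size + 1)
            buf[size] = canary             # guard byte just past user data
            handle = id(buf)
            self.live[handle] = (buf, size, canary)
            return handle, buf

        def free(self, handle):
            buf, size, canary = self.live.pop(handle)
            if buf[size] != canary:        # clobbered guard: overflow happened
                raise MemoryError("heap overflow detected by canary")

    heap = CanaryHeap()
    h, buf = heap.malloc(8)
    buf[0:8] = b"A" * 8    # in-bounds write: free succeeds
    heap.free(h)
    h, buf = heap.malloc(8)
    buf[0:9] = b"B" * 9    # one-byte overflow clobbers the canary
    try:
        heap.free(h)
    except MemoryError as e:
        print(e)
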
MooD: MObility Data Privacy as Orphan Disease: Experimentation and Deployment Paper
Besma Khalfoun, Mohamed Maouche, Sonia Ben Mokhtar, S. Bouchenak
DOI: 10.1145/3361525.3361542

With the increasing development of handheld devices, Location-Based Services (LBSs) have become very popular, facilitating users' daily life with a broad range of applications (e.g., traffic monitoring, geo-located search, geo-gaming). However, several studies have shown that the collected mobility data may reveal sensitive information about end users, such as their home and workplace, their gender, or their political, religious, or sexual preferences. To overcome these threats, many Location Privacy Protection Mechanisms (LPPMs) have been proposed in the literature. While existing LPPMs protect most of the users in a mobility dataset, there is usually a subset of users who are not protected by any of them. By analogy to medical research, these users suffer from orphan diseases, for which the medical community is still looking for a remedy. In this paper, we present MooD, a fine-grained, multi-LPPM, user-centric solution whose main objective is to find a treatment for mobile users' orphan disease by protecting them from re-identification attacks. Our experiments are conducted on four real-world datasets. The results show that MooD outperforms its competitors, protecting between 97.5% and 100% of user mobility data on the various datasets.
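
The user-centric, multi-LPPM idea suggests a simple per-user loop: keep trying protection mechanisms against a re-identification attack until one defeats it. The mechanism list, attack oracle, and first-success stopping rule below are our assumptions, not MooD's published pipeline:

    # Per-user multi-LPPM protection sketch: users for whom no mechanism
    # defeats the attack are the "orphans" the paper targets.
    def protect_user(trace, lppms, reidentifies):
        """lppms: functions trace -> obfuscated trace.
        reidentifies: attack oracle, True if the user is still re-identified."""
        for lppm in lppms:
            candidate = lppm(trace)
            if not reidentifies(candidate):
                return candidate  # first mechanism that defeats the attack
        return None               # orphan user: no single LPPM suffices

    def protect_dataset(traces, lppms, reidentifies):
        protected, orphans = {}, []
        for user, trace in traces.items():
            result = protect_user(trace, lppms, reidentifies)
            if result is None:
                orphans.append(user)
            else:
                protected[user] = result
        return protected, orphans
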
PrivaTube: Privacy-Preserving Edge-Assisted Video Streaming
Simon Da Silva, Sonia Ben Mokhtar, Stefan Contiu, D. Négru, Laurent Réveillère, E. Rivière
DOI: 10.1145/3361525.3361546

Video on Demand (VoD) streaming is the largest source of Internet traffic. Efficient and scalable VoD requires Content Delivery Networks (CDNs), whose costs are prohibitive for many providers. An alternative is to cache and serve video content using end-users' devices. Direct connections between these devices complement the resources of core VoD servers with an edge-assisted collaborative CDN. However, VoD access histories can reveal critical personal information, and centralized VoD solutions are notorious for exploiting personal data. Hiding users' interests from servers and edge-assisting devices is necessary for a new generation of privacy-preserving VoD services. We introduce PrivaTube, a scalable and cost-effective VoD solution. PrivaTube aggregates video content from multiple servers and edge peers to offer a high Quality of Experience (QoE) to its users. It enables privacy preservation at all levels of the content distribution process. It leverages Trusted Execution Environments (TEEs) at servers and clients, and obfuscates access patterns using fake requests that reduce the risk of personal information leaks. Fake requests are further leveraged to implement proactive provisioning and improve QoE. Our evaluation of a complete prototype shows that PrivaTube reduces the load on servers and increases QoE while providing strong privacy guarantees.
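
The fake-request idea can be sketched compactly: pad each real fetch with decoy fetches and shuffle, so neither servers nor assisting peers learn which request reflects genuine interest. The decoy count and the popularity-weighted draw are our assumptions, not PrivaTube's tuning:

    # Access-pattern obfuscation sketch: decoys are drawn from the catalog's
    # popularity distribution so fake traffic looks statistically plausible.
    import random

    def request_plan(real_video, catalog_popularity, fakes_per_real=2):
        videos = list(catalog_popularity)
        weights = list(catalog_popularity.values())
        fakes = random.choices(videos, weights=weights, k=fakes_per_real)
        plan = [(real_video, True)] + [(v, False) for v in fakes]
        random.shuffle(plan)  # fetch order must not betray the real request
        return plan           # decoy fetches double as cache pre-provisioning

    plan = request_plan("vid_17", {"vid_17": 0.1, "vid_3": 0.6, "vid_9": 0.3})
    for video, is_real in plan:
        pass  # fetch(video); only the client knows which flag was True
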
Automating Multi-level Performance Elastic Components for IBM Streams
Xiang Ni, S. Schneider, Raju Pavuluri, Jonathan Kaus, Kun-Lung Wu
DOI: 10.1145/3361525.3361544

Streaming applications exhibit abundant opportunities for pipeline parallelism, data parallelism, and task parallelism. Prior work in IBM Streams introduced an elastic threading model that sought the best performance by automatically tuning the number of threads. In this paper, we introduce the ability to automatically discover where that threading model is profitable. However, this introduces a new challenge: we have separate performance-elasticity mechanisms that are designed with different objectives, leading to potential negative interactions and unintended performance degradation. We present our experiences in overcoming these challenges by showing how to coordinate separate but interfering elasticity mechanisms to maximize performance gains with stable and fast parallelism exploitation. We first describe an elastic mechanism that automatically adapts different threading models to different regions of an application. We then show a coherent ecosystem for coordinating this threading-model elasticity with thread-count elasticity. The result is an online, stable, multi-level elastic coordination scheme that adapts different regions of a streaming application to different threading models and numbers of threads. We implemented this multi-level coordination scheme in IBM Streams and demonstrated that it (a) scales to over a hundred threads; (b) can improve performance by an order of magnitude on two different processor architectures when an application can benefit from multiple threading models; and (c) achieves performance comparable to hand-optimized applications but with far fewer threads.
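
Thread-count elasticity of the kind the paper coordinates can be pictured as a feedback loop: add threads while measured throughput keeps improving, stop when the gains flatten. The controller below is a deliberately naive hill climber, not IBM Streams' algorithm; it illustrates the kind of independent tuner whose interference the paper's coordination scheme must keep stable:

    # Hill-climbing thread-count tuner: grow the thread pool until an extra
    # thread yields less than a 5% throughput gain.
    def tune_threads(measure_throughput, min_t=1, max_t=128, gain=1.05):
        """measure_throughput(n): run briefly with n threads, return tuples/sec."""
        threads = min_t
        best = measure_throughput(threads)
        while threads < max_t:
            trial = measure_throughput(threads + 1)
            if trial < best * gain:  # marginal gain too small: stop growing
                break
            threads, best = threads + 1, trial
        return threads

    # Toy workload whose throughput saturates around 8 threads.
    print(tune_threads(lambda n: min(n, 8) * 1000 - n * 10))  # -> 8
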
Medley: A Novel Distributed Failure Detector for IoT Networks
Rui Yang, Shichu Zhu, Yifei Li, Indranil Gupta
DOI: 10.1145/3361525.3361556

Efficient and correct operation of an IoT network requires the presence of a failure detector and membership protocol amongst the IoT nodes. This paper presents a new failure detector for IoT settings where nodes are connected via a wireless ad-hoc network. This failure detector, which we name Medley, is fully decentralized, allows IoT nodes to maintain a local membership list of other alive nodes, detects failures quickly (and updates the membership list accordingly), and incurs low communication overhead in the underlying ad-hoc network. To minimize detection time and communication, we adapt a failure detector originally proposed for datacenters (SWIM) to the IoT environment. In Medley, each node picks a medley of ping targets in a randomized and skewed manner, preferring nearer nodes. Via analysis and NS-3 simulation, we show the right mix of pinging probabilities that simultaneously optimizes detection time and communication traffic. We have also implemented Medley for Raspberry Pis and present deployment results.
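
The "randomized and skewed" target selection admits a direct sketch: ping node j with probability proportional to distance^(-alpha), keeping most probe traffic local while still occasionally probing far nodes. The power-law form of the skew is our illustrative assumption; the paper derives the mix that actually optimizes detection time and traffic:

    # Distance-skewed ping-target selection: nearer nodes are pinged more
    # often, so most probes stay within short, cheap radio hops.
    import math
    import random

    def pick_ping_target(self_pos, members, alpha=2.0):
        """members: {node_id: (x, y)}; returns the node id to ping this round."""
        ids, weights = [], []
        for node, pos in members.items():
            d = math.dist(self_pos, pos)
            if d > 0:  # skip self / co-located entries
                ids.append(node)
                weights.append(d ** -alpha)
        return random.choices(ids, weights=weights, k=1)[0]

    members = {"a": (1, 0), "b": (5, 0), "c": (10, 0)}
    # With alpha=2, "a" is ~25x more likely than "b" and ~100x more than "c".
    print(pick_ping_target((0, 0), members))
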