Cuckoo Node Hashing on GPUs
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00013
Muhammad Javed, Hao Zhou, David Troendle, Byunghyun Jang
The hash table finds numerous applications across many domains, but its potential for non-coalesced memory accesses and its execution-divergence characteristics pose optimization challenges on GPUs. We propose a novel hash table design, referred to as Cuckoo Node Hashing, which aims to better exploit the massive data parallelism offered by GPUs. At its core, we leverage Cuckoo Hashing, one of the well-known hash table design schemes, in a closed-address manner, which, to our knowledge, is the first such attempt on GPUs. We also propose an architecture-aware warp-cooperative reordering algorithm that improves the memory performance of Cuckoo Node Hashing, reduces its thread divergence, and efficiently increases the likelihood of coalesced memory accesses in hash table operations. Our experiments show that Cuckoo Node Hashing outperforms and scales better than existing state-of-the-art GPU hash table designs such as DACHash and Slab Hash, with a peak performance of 5.03 billion queries/second in static searching and 4.34 billion insertions/second in static building.
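To make the eviction mechanism behind the name concrete, here is a minimal CPU-side sketch of classic cuckoo hashing with two tables and two hash functions. It illustrates only the base scheme the paper builds on, not its closed-address GPU variant; the class name, hash mixers, and kick limit are all illustrative assumptions.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <utility>
#include <vector>

// Minimal sketch of classic cuckoo hashing with two tables. The paper's
// Cuckoo Node Hashing is a closed-address GPU variant; this shows only
// the eviction mechanism the scheme is named after.
class CuckooTable {
    static constexpr int kMaxKicks = 32;      // evictions before giving up
    std::vector<std::optional<uint32_t>> t0_, t1_;

    size_t h0(uint32_t k) const { return (k * 2654435761u) % t0_.size(); }
    size_t h1(uint32_t k) const { return (k ^ (k >> 16)) % t1_.size(); }

public:
    explicit CuckooTable(size_t n) : t0_(n), t1_(n) {}

    bool insert(uint32_t key) {
        for (int kick = 0; kick < kMaxKicks; ++kick) {
            auto& s0 = t0_[h0(key)];
            if (!s0) { s0 = key; return true; }
            std::swap(key, *s0);              // evict occupant of table 0
            auto& s1 = t1_[h1(key)];
            if (!s1) { s1 = key; return true; }
            std::swap(key, *s1);              // evict occupant of table 1
        }
        return false;                         // would trigger a rehash in practice
    }

    bool contains(uint32_t key) const {       // at most two probes per lookup
        return (t0_[h0(key)] && *t0_[h0(key)] == key) ||
               (t1_[h1(key)] && *t1_[h1(key)] == key);
    }
};

int main() {
    CuckooTable table(1024);
    for (uint32_t k = 1; k <= 100; ++k) table.insert(k);
    std::cout << table.contains(42) << ' ' << table.contains(4242) << '\n';  // 1 0
}
```

The attraction for GPUs is that a lookup probes at most two fixed locations, which a warp can issue as predictable, potentially coalesced accesses.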
{"title":"Cuckoo Node Hashing on GPUs","authors":"Muhammad Javed, Hao Zhou, David Troendle, Byunghyun Jang","doi":"10.1109/ISPDC55340.2022.00013","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00013","url":null,"abstract":"The hash table finds numerous applications in many different domains, but its potential for non-coalesced memory accesses and execution divergence characteristics impose optimization challenges on GPUs. We propose a novel hash table design, referred to as Cuckoo Node Hashing, which aims to better exploit the massive data parallelism offered by GPUs. At the core of its design, we leverage Cuckoo Hashing, one of known hash table design schemes, in a closed-address manner, which, to our knowledge, is the first attempt on GPUs. We also propose an architecture-aware warp-cooperative reordering algorithm that improves the memory performance and thread divergence of Cuckoo Node Hashing and efficiently increases the likelihood of coalesced memory accesses in hash table operations. Our experiments show that Cuckoo Node Hashing outperforms and scales better than existing state-of-the-art GPU hash table designs such as DACHash and Slab Hash with a peak performance of 5.03 billion queries/second in static searching and 4.34 billion insertions/second in static building.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127319594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Workload Deployment and Configuration Reconciliation at Scale in Kubernetes-Based Edge-Cloud Continuums
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00026
D. Hass, Josef Spillner
Continuum computing promises to abstract away physical node location and the node platform stack in order to create seamless application deployment and execution across edges and cloud data centres. For industrial IoT applications, the demand to generate data insights, in conjunction with an installed base of increasingly capable edge devices, calls for appropriate continuum computing interfaces. Derived from a case study in industrial water-flow monitoring, and based on the industry’s de-facto standard Kubernetes for deploying complex containerised workloads, we present an appropriate continuum deployment mechanism based on custom Kubernetes controllers and CI/CD, called the Kontinuum Controller. Through synthetic experiments and a holistic cross-provider deployment, we investigate its scalability, with emphasis on reconciling adjusted configuration per application and per node, a critical requirement of industrial customers. Our findings show that Kubernetes by default enters undesirable oscillation even for modestly sized deployments. Thus, we also discuss possible solutions.
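As background on the mechanism the abstract relies on, the sketch below shows a generic level-triggered reconciliation loop in the style of a Kubernetes custom controller: diff desired state against observed state and patch only the drift. It is a minimal illustration under assumed names, not the Kontinuum Controller itself.

```cpp
#include <chrono>
#include <iostream>
#include <map>
#include <string>
#include <thread>

// Generic level-triggered reconciliation loop in the style of a Kubernetes
// custom controller: repeatedly diff desired state against observed state
// and apply only the difference. Illustrative sketch; the paper's Kontinuum
// Controller reconciles per-application, per-node configuration.
using State = std::map<std::string, std::string>;   // node -> config version

State fetchDesired()  { return {{"edge-1", "v2"}, {"edge-2", "v2"}}; }  // stand-in
State fetchObserved() { return {{"edge-1", "v1"}, {"edge-2", "v2"}}; }  // stand-in

void apply(const std::string& node, const std::string& cfg) {
    std::cout << "patching " << node << " -> " << cfg << '\n';
}

int main() {
    for (int cycle = 0; cycle < 3; ++cycle) {
        State desired = fetchDesired(), observed = fetchObserved();
        for (const auto& [node, cfg] : desired)
            if (observed[node] != cfg)        // act only on drift, to avoid
                apply(node, cfg);             // the oscillation the paper reports
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
}
```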
{"title":"Workload Deployment and Configuration Reconciliation at Scale in Kubernetes-Based Edge-Cloud Continuums","authors":"D. Hass, Josef Spillner","doi":"10.1109/ISPDC55340.2022.00026","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00026","url":null,"abstract":"Continuum computing promises the abstraction of physical node location and node platform stack in order to create a seamless application deployment and execution across edges and cloud data centres. For industrial IoT applications, the demand to generate data insights in conjunction with an installed base of increasingly capable edge devices is calling for appropriate continuum computing interfaces. Derived from a case study in industrial water flow monitoring and based on the industry’s de-facto standard Kubernetes to deploy complex containerised workloads, we present an appropriate continuum deployment mechanism based on custom Kubernetes controllers and CI/CD, called Kontinuum Controller. Through synthetic experiments and a holistic cross-provider deployment, we investigate its scalability with emphasis on reconciling adjusted configuration per application and per node, a critical requirement by industrial customers. Our findings convey that Kubernetes by default would enter undesirable oscillation already for modestly sized deployments. Thus, we also discuss possible solutions.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122771608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hybrid clustering algorithm for high-performance edge computing devices [Short]
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00020
G. Laccetti, M. Lapegna, D. Romano
Clustering algorithms are efficient tools for discovering correlations or affinities within large datasets and are the basis of several Artificial Intelligence processes based on data generated by sensor networks. Recently, such algorithms have found an active application area closely tied to the Edge Computing paradigm. The final aim is to transfer intelligence and decision-making ability to the edge of sensor networks, thus avoiding the stringent demands for low-latency, large-bandwidth networks typical of the Cloud Computing model. In this context, the present work describes a new hybrid version of a clustering algorithm for the NVIDIA Jetson Nano board, integrating two different parallel strategies. The algorithm is then evaluated in terms of performance and energy consumption, comparing it with two high-end GPU-based computing systems. The results confirm the possibility of creating intelligent sensor networks where decisions are taken at the data collection points.
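The abstract does not name the clustering kernel, so purely as an assumed example, the sketch below shows one k-means-style assignment/update iteration on 1-D data, the kind of data-parallel step a hybrid CPU/GPU design would typically offload.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

// One k-means-style iteration on 1-D data: assign each point to its nearest
// centroid, then recompute centroids. Purely illustrative; the abstract does
// not name the algorithm. A hybrid design would parallelize the assignment
// step (data-parallel, GPU-friendly) separately from the update step.
void kmeansStep(const std::vector<double>& pts, std::vector<double>& centroids) {
    std::vector<double> sum(centroids.size(), 0.0);
    std::vector<int> cnt(centroids.size(), 0);
    for (double p : pts) {                         // assignment: nearest centroid
        size_t best = 0;
        for (size_t c = 1; c < centroids.size(); ++c)
            if (std::fabs(p - centroids[c]) < std::fabs(p - centroids[best]))
                best = c;
        sum[best] += p; cnt[best]++;
    }
    for (size_t c = 0; c < centroids.size(); ++c)  // update: new means
        if (cnt[c] > 0) centroids[c] = sum[c] / cnt[c];
}

int main() {
    std::vector<double> pts = {1.0, 1.2, 0.8, 8.0, 8.3, 7.9};
    std::vector<double> centroids = {0.0, 5.0};
    for (int it = 0; it < 5; ++it) kmeansStep(pts, centroids);
    std::cout << centroids[0] << ' ' << centroids[1] << '\n';  // ~1.0 and ~8.07
}
```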
{"title":"A hybrid clustering algorithm for high-performance edge computing devices [Short]","authors":"G. Laccetti, M. Lapegna, D. Romano","doi":"10.1109/ISPDC55340.2022.00020","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00020","url":null,"abstract":"Clustering algorithms are efficient tools for discovering correlations or affinities within large datasets and are the basis of several Artificial Intelligence processes based on data generated by sensor networks. Recently, such algorithms have found an active application area closely correlated to the Edge Computing paradigm. The final aim is to transfer intelligence and decision-making ability near the edge of the sensors networks, thus avoiding the stringent requests for low-latency and large-bandwidth networks typical of the Cloud Computing model. In such a context, the present work describes a new hybrid version of a clustering algorithm for the NVIDIA Jetson Nano board by integrating two different parallel strategies. The algorithm is later evaluated from the points of view of the performance and energy consumption, comparing it with two high-end GPU-based computing systems. The results confirm the possibility of creating intelligent sensor networks where decisions are taken at the data collection points.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123004379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A scalable algorithm for homomorphic computing on multi-core clusters
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00017
F. Gava, L. Bayati
Homomorphic encryption has drawn considerable attention as it provides a way of performing privacy-preserving computations on encrypted data. Unfortunately, such computations are extremely expensive in terms of both computation time and memory consumption, and thus much slower than the corresponding computations on unencrypted data. One solution is parallelism, and in this work we investigate using distributed architectures of interconnected nodes (multi-core clusters) to execute homomorphic computations programmed with the Cingulata environment, a toolchain able to generate Boolean circuits (whose gates manipulate encrypted Booleans) from homomorphic C++ codes. Such circuits are split into slices, and we use a BSP algorithm to execute each slice.
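To illustrate the slicing idea, the sketch below evaluates a layered Boolean circuit one layer at a time; in a BSP execution, each layer (slice) would be one superstep whose independent gates are divided among processors and followed by a barrier. Plain bools stand in for ciphertexts, and all names are illustrative assumptions.

```cpp
#include <iostream>
#include <vector>

// Slice-wise evaluation of a layered Boolean circuit: gates in the same
// layer are independent, so each layer can be one BSP superstep whose gates
// are divided among processors, followed by a synchronisation barrier.
enum class Op { AND, XOR };
struct Gate { Op op; int in0, in1, out; };

void evalLayer(const std::vector<Gate>& layer, std::vector<bool>& wires) {
    // In a BSP execution this loop would be split across processors and
    // followed by a barrier before the next layer (slice) starts.
    for (const Gate& g : layer)
        wires[g.out] = (g.op == Op::AND) ? (wires[g.in0] && wires[g.in1])
                                         : (wires[g.in0] != wires[g.in1]);
}

int main() {
    // Half adder on wires 0,1: sum -> wire 2, carry -> wire 3.
    std::vector<bool> wires = {true, true, false, false};
    std::vector<std::vector<Gate>> layers = {
        {{Op::XOR, 0, 1, 2}, {Op::AND, 0, 1, 3}}   // one layer, two gates
    };
    for (const auto& layer : layers) evalLayer(layer, wires);
    std::cout << "sum=" << wires[2] << " carry=" << wires[3] << '\n';  // sum=0 carry=1
}
```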
{"title":"A scalable algorithm for homomorphic computing on multi-core clusters","authors":"F. Gava, L. Bayati","doi":"10.1109/ISPDC55340.2022.00017","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00017","url":null,"abstract":"Homomorphic encryption draws huge attention as it provides a way of privacy-preserving computations on encrypted data. But sadly, such computations are extremely expensive both in terms of calculation time and memory consumption and so much slower than the corresponding computations with unencrypted data. One solution is using parallelism and in this work, we investigate using distributed architectures of interconnected nodes (multi-core clusters) to execute homomorphic computations that have been programmed with the cingulata environment, a toolchain which is able to generate boolean circuits (where gates manipulate encrypted booleans) from homomorphic C++ codes. Such circuits are spliting into slices and we have used a bsp algorithm to executed each of them.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126682461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating the Impact of Communication Schemes for Distributed Graph Processing
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00016
Tian Ye, S. Kuppannagari, C. Rose, Sasindu Wijeratne, R. Kannan, V. Prasanna
Extreme-scale graph analytics is imperative for several real-world Big Data applications whose underlying graph structure contains millions or billions of vertices and edges. Since such huge graphs cannot fit into the memory of a single computer, distributed processing of the graph is required. Several frameworks have been developed for performing graph processing on distributed systems. These frameworks focus primarily on choosing the right computation model and partitioning scheme, under the assumption that such design choices will automatically reduce the communication overheads. For any computational model and partitioning scheme, the communication schemes — the data to be communicated and the virtual interconnection network among the nodes — have a significant impact on performance. To analyze this impact, in this work we identify widely used communication schemes and estimate their performance. Analyzing the trade-offs between the number of compute nodes and the communication costs of various schemes on a distributed platform by brute-force experimentation can be prohibitively expensive. Thus, our performance estimation models provide an economical way to perform these analyses, given the partitions and the communication scheme as input. We validate our model on a local HPC cluster as well as the cloud-hosted NSF Chameleon cluster. Using our estimates as well as actual measurements, we compare the communication schemes and provide conditions under which one scheme should be preferred over the others.
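As a toy instance of such an estimate, the sketch below counts per-node message volume for a block (1-D) vertex partition, charging one message in each direction for every cut edge. This is a minimal counting model for illustration, not the paper's estimator.

```cpp
#include <iostream>
#include <utility>
#include <vector>

// Toy communication-volume estimate for a 1-D (block) vertex partition:
// every edge whose endpoints live on different nodes contributes one
// message in each direction. Not the paper's estimation model.
int owner(int v, int verticesPerNode) { return v / verticesPerNode; }

int main() {
    const int numNodes = 2, verticesPerNode = 3;   // vertices 0..5, two nodes
    std::vector<std::pair<int, int>> edges = {
        {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0}};

    std::vector<long> sent(numNodes, 0);
    for (auto [u, v] : edges) {
        int ou = owner(u, verticesPerNode), ov = owner(v, verticesPerNode);
        if (ou != ov) { sent[ou]++; sent[ov]++; }  // cut edge: one message each way
    }
    for (int n = 0; n < numNodes; ++n)
        std::cout << "node " << n << " sends " << sent[n] << " messages\n";
}
```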
{"title":"Estimating the Impact of Communication Schemes for Distributed Graph Processing","authors":"Tian Ye, S. Kuppannagari, C. Rose, Sasindu Wijeratne, R. Kannan, V. Prasanna","doi":"10.1109/ISPDC55340.2022.00016","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00016","url":null,"abstract":"Extreme scale graph analytics is imperative for several real-world Big Data applications with the underlying graph structure containing millions or billions of vertices and edges. Since such huge graphs cannot fit into the memory of a single computer, distributed processing of the graph is required. Several frameworks have been developed for performing graph processing on distributed systems. The frameworks focus primarily on choosing the right computation model and the partitioning scheme under the assumption that such design choices will automatically reduce the communication overheads. For any computational model and partitioning scheme, communication schemes — the data to be communicated and the virtual interconnection network among the nodes — have significant impact on the performance. To analyze this impact, in this work, we identify widely used communication schemes and estimate their performance. Analyzing the trade-offs between the number of compute nodes and communication costs of various schemes on a distributed platform by brute force experimentation can be prohibitively expensive. Thus, our performance estimation models provide an economic way to perform the analyses given the partitions and the communication scheme as input. We validate our model on a local HPC cluster as well as the cloud hosted NSF Chameleon cluster. Using our estimates as well as the actual measurements, we compare the communication schemes and provide conditions under which one scheme should be preferred over the others.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114227248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sponsors and Conference Support","authors":"","doi":"10.1109/icbake.2013.76","DOIUrl":"https://doi.org/10.1109/icbake.2013.76","url":null,"abstract":"","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114613348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance Comparison of Speculative Taskloop and OpenMP-for-Loop Thread-Level Speculation on Hardware Transactional Memory
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00021
Juan Salamanca
Speculative Taskloop (STL) is a loop parallelization technique that combines the best of task-based parallelism and Thread-Level Speculation to speed up loops with may (i.e., possible but unproven) loop-carried dependencies, which were previously difficult for compilers to parallelize. Previous studies show the efficiency of STL when implemented using Hardware Transactional Memory (HTM) and the advantages it offers over a typical DOACROSS technique such as OpenMP ordered. This paper presents a performance comparison between STL and a previously proposed technique that implements Thread-Level Speculation (TLS) in the for worksharing construct (FOR-TLS), over a set of loops from the cBench and SPEC2006 benchmarks. The results offer interesting insights into how each technique can be more appropriate depending on the characteristics of the evaluated loop. Experimental results reveal that, with both techniques implemented on top of HTM, speed-ups of up to 2.41× can be obtained for STL and up to 2× for FOR-TLS.
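For readers unfamiliar with the baseline construct, the sketch below shows a plain (non-speculative) OpenMP taskloop, which chunks the iteration space into runtime tasks. STL extends this construct with speculation on hardware transactions; that machinery is not shown here. Compile with -fopenmp.

```cpp
#include <cstdio>
#include <omp.h>

// Baseline OpenMP taskloop: the runtime splits the iteration space into
// tasks of ~grainsize iterations. Speculative Taskloop (STL) extends this
// construct by running chunks inside hardware transactions and aborting on
// a detected loop-carried dependence; that part is omitted here.
int main() {
    constexpr int n = 1000;
    static double a[n];
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskloop grainsize(100)
        for (int i = 0; i < n; ++i)
            a[i] = i * 0.5;                 // independent iterations
    }
    std::printf("a[999] = %.1f\n", a[999]); // 499.5
    return 0;
}
```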
{"title":"Performance Comparison of Speculative Taskloop and OpenMP-for-Loop Thread-Level Speculation on Hardware Transactional Memory","authors":"Juan Salamanca","doi":"10.1109/ISPDC55340.2022.00021","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00021","url":null,"abstract":"Speculative Taskloop (STL) is a loop parallelization technique that takes the best of Task-based Parallelism and Thread-Level Speculation to speed up loops with may loop-carried dependencies that were previously difficult for compilers to parallelize. Previous studies show the efficiency of STL when implemented using Hardware Transactional Memory and the advantages it offers compared to a typical DOACROSS technique such as OpenMP ordered. This paper presents a performance comparison between STL and a previously proposed technique that implements Thread-Level Speculation (TLS) in the for worksharing construct (FOR-TLS) over a set of loops from cbench and SPEC2006 benchmarks. The results show interesting insights on how each technique can be more appropriate depending on the characteristics of the evaluated loop. Experimental results reveal that by implementing both techniques on top of HTM, speed-ups of up to 2.41× can be obtained for STL and up to 2× for FOR-TLS.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115793756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
[Full] Deep Heuristic for Broadcasting in Arbitrary Networks
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00010
Hovhannes A. Harutyunyan, Narek A. Hovhannisyan, Rakshit Magithiya
Broadcasting is an information dissemination problem in a connected graph in which one vertex, called the originator, must distribute a message to all other vertices by placing a series of calls along the edges of the graph. At each time step, the already-informed vertices aid the originator in distributing the message. Finding the broadcast time of an arbitrary vertex in an arbitrary graph is NP-complete. We design an efficient heuristic that improves on the results of existing heuristics in most cases. Extensive simulations show that our new heuristic outperforms the existing ones for most of the commonly used interconnection networks and in some network models generated by the ns-2 network simulator.
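A minimal baseline for the problem is a greedy round-based simulation in the classical telephone model, where each informed vertex calls at most one uninformed neighbour per round. The sketch below implements that baseline for illustration; it is not the paper's deep heuristic, which chooses call targets more carefully.

```cpp
#include <iostream>
#include <vector>

// Round-based broadcasting in the telephone model: per round, each informed
// vertex calls at most one uninformed neighbour (here, greedily, the first
// one found). A baseline only, not the paper's deep heuristic.
int broadcastRounds(const std::vector<std::vector<int>>& adj, int origin) {
    std::vector<bool> informed(adj.size(), false);
    std::vector<int> frontier = {origin};
    informed[origin] = true;
    int rounds = 0;
    size_t total = 1;
    while (total < adj.size()) {
        ++rounds;
        std::vector<int> newly;
        for (int u : frontier)                       // each caller informs at
            for (int v : adj[u])                     // most one neighbour/round
                if (!informed[v]) {
                    informed[v] = true;
                    newly.push_back(v);
                    ++total;
                    break;
                }
        if (newly.empty()) break;                    // disconnected remainder
        for (int v : newly) frontier.push_back(v);
    }
    return rounds;
}

int main() {
    // Path graph 0-1-2-3: broadcasting from vertex 0 takes 3 rounds.
    std::vector<std::vector<int>> adj = {{1}, {0, 2}, {1, 3}, {2}};
    std::cout << broadcastRounds(adj, 0) << '\n';    // prints 3
}
```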
{"title":"[Full] Deep Heuristic for Broadcasting in Arbitrary Networks","authors":"Hovhannes A. Harutyunyan, Narek A. Hovhannisyan, Rakshit Magithiya","doi":"10.1109/ISPDC55340.2022.00010","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00010","url":null,"abstract":"Broadcasting is an information dissemination problem in a connected graph in which one vertex, called the originator, must distribute a message to all other vertices by placing a series of calls along the edges of the graph. Every time the informed vertices aid the originator in distributing the message. Finding the broadcast time of any vertex in an arbitrary graph is NP-complete. We designed an efficient heuristic, which improves the results of existing heuristics in most cases. Extensive simulations show that our new heuristic outperforms the existing ones for most of the commonly used interconnection networks in some network models generated by network simulator ns-2.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117303018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A type system to avoid runtime errors for Multi-ML
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00015
F. Gava, V. Allombert, J. Tesson
Programming parallel architectures from a hierarchical point of view is becoming today’s standard, as machines are structured in multiple layers of memory. To handle such architectures, we focus on the MULTI-BSP bridging model. This model extends BSP and proposes a structured way of programming multi-level architectures. In the context of parallel programming, we now need to manage new concerns such as memory coherency, deadlocks, and safe data communications. To do so, we propose a type system for MULTI-ML, an ML-like programming language based on the MULTI-BSP model. This type system introduces data locality using type annotations and effects, in order to detect incorrect uses of multi-level architectures. We thus ensure that "Well-typed programs cannot go wrong" on hierarchical architectures.
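MULTI-ML itself is an ML-family language, so the following C++ sketch is only a loose analogy, chosen to match the other examples in this listing: a phantom template parameter tags each value with the memory level it lives at, so that mixing levels is rejected at compile time, mirroring the idea of locality errors caught by the type system.

```cpp
#include <iostream>

// Loose analogy to MULTI-ML's locality annotations: a phantom template
// parameter records the MULTI-BSP memory level a value lives at, and
// combining values from different levels fails to compile. This only
// mirrors the idea of "wrong uses detected by the type system".
template <int Level>
struct At {
    double value;
};

template <int L>
At<L> add(At<L> a, At<L> b) {          // only same-level values may be mixed
    return {a.value + b.value};
}

int main() {
    At<0> rootA{1.0}, rootB{2.0};      // level 0: root of the hierarchy
    At<1> leaf{3.0};                   // level 1: one node down

    std::cout << add(rootA, rootB).value << '\n';      // fine: both level 0
    std::cout << add(leaf, At<1>{4.0}).value << '\n';  // fine: both level 1
    // add(rootA, leaf);               // compile error: levels 0 and 1 differ
}
```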
{"title":"A type system to avoid runtime errors for Multi-ML","authors":"F. Gava, V. Allombert, J. Tesson","doi":"10.1109/ISPDC55340.2022.00015","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00015","url":null,"abstract":"Programming parallel architectures using a hierarchical point of view is becoming today’s standard as machines are structured by multiple layers of memories. To handle such architectures, we focus on the MULTI-BSP bridging model. This model extends BSP and proposes a structured way of programming multi-level architectures. In the context of parallel programming we, now need to manage new concerns such as memory coherency, deadlocks and safe data communications. To do so, we propose a typing system for MULTI-ML, a ML-like programming language based on the MULTI-BSP model. This type system introduces data locality using type annotations and effects to be able to detected wrong uses of multi-level architectures. We thus ensure that \"Well-typed programs cannot go wrong\" on hierarchical architectures.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126354049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance Modeling of Scalable Resource Allocations with the Imperial PEPA Compiler
Pub Date: 2022-07-01 | DOI: 10.1109/ISPDC55340.2022.00023
W. Sanders, Srishti Srivastava, I. Banicescu
Advances in computational resources have led to corresponding increases in the scale of large parallel and distributed computer (PDC) systems. With these increases in scale, it becomes increasingly important to understand how these systems will perform while they are still being planned and defined, rather than after deployment. Modeling and simulation of these systems can be used to identify unexpected problems and bottlenecks and to verify operational functionality, and can result in significant cost savings and avoidance if done prior to the often large capital expenditures that accompany major PDC system deployments. In this paper, we evaluate how PDC systems perform as both the number of applications and the number of machines increase. We generate 42,000 models and evaluate them with the Imperial PEPA Compiler to determine the scaling effects across both an increasing number of applications and an increasing number of machines. These results are then used to develop a heuristic for predicting the makespan of sets of applications mapped onto a number of machines, where the applications are subject to perturbations at runtime. While in the current work the estimated application rates and perturbed rates are based on the uniform probability distribution, future work will include a wider range of probability distributions for these rates.
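As a toy version of such a prediction, the sketch below maps unit-time applications round-robin onto machines, perturbs each execution time by a uniform factor (matching the distribution the abstract mentions), and reports the busiest machine's total as the makespan estimate. It is illustrative only, not the paper's model-derived heuristic, and all constants are assumptions.

```cpp
#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

// Toy makespan estimate: unit-time applications are mapped round-robin onto
// machines, each execution time perturbed by a uniform factor, and the
// makespan is the busiest machine's total load. Illustrative only; the
// paper's heuristic is derived from PEPA model evaluations.
int main() {
    const int numApps = 12, numMachines = 3;
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> perturb(0.8, 1.2);

    std::vector<double> load(numMachines, 0.0);
    for (int app = 0; app < numApps; ++app) {
        double nominalTime = 1.0;                       // unit-time application
        load[app % numMachines] += nominalTime * perturb(rng);
    }
    double makespan = *std::max_element(load.begin(), load.end());
    std::cout << "estimated makespan: " << makespan << '\n';
}
```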
{"title":"Performance Modeling of Scalable Resource Allocations with the Imperial PEPA Compiler","authors":"W. Sanders, Srishti Srivastava, I. Banicescu","doi":"10.1109/ISPDC55340.2022.00023","DOIUrl":"https://doi.org/10.1109/ISPDC55340.2022.00023","url":null,"abstract":"Advances in computational resources have led to corresponding increases in the scale of large parallel and distributed computer (PDC) systems. With these increases in scale, it becomes increasingly important to understand how these systems will perform as they scale when they are planned and defined, rather than post deployment. Modeling and simulation of these systems can be used to identify unexpected problems and bottlenecks, verify operational functionality, and can result in significant cost savings and avoidance if done prior to the often large capital expenditures that accompany major parallel and distributed computer system deployments. In this paper, we evaluate how PDC systems perform while they are subject to increases in both the number of applications and the number of machines. We generate 42,000 models and evaluate them with the Imperial PEPA Compiler to determine the scaling effects across both an increasing number of applications and an increasing number of machines. These results are then utilized to develop a heuristic for predicting the makespan time for sets of applications mapped onto a number of machines where the applications are subjected to perturbations at runtime. While in the current work the estimated application rates and perturbed rates considered are based on the uniform probability distribution, future work will include a wider range of probability distributions for these rates.","PeriodicalId":389334,"journal":{"name":"2022 21st International Symposium on Parallel and Distributed Computing (ISPDC)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128244532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}