We study the repeated balls-into-bins process introduced by Becchetti, Clementi, Natale, Pasquale and Posta [3]. This process starts with m balls arbitrarily distributed across n bins. At each step t = 1, 2, …, we select one ball from each non-empty bin, and then place it into a bin chosen independently and uniformly at random. We prove the following results. For any n ≤ m ≤ poly(n), we prove a lower bound of Ω(m/n · log n) on the maximum load. For the special case m = n, this matches the upper bound of O(log n) shown in [3]. It also provides a positive answer to the conjecture in [3] that for m = n the maximum load is ω(log n / log log n) in a polynomially large window. For m ∈ [ω(n), n log n], our new lower bound also disproves the conjecture in [3] that the maximum load remains O(log n). For any n ≤ m ≤ poly(n), we prove an upper bound of O(m/n · log n) on the maximum load for a polynomially large window, which matches our lower bound. For any m ≥ n, our analysis also implies an O(m²/n) waiting time to a configuration with O(m/n · log m) maximum load, even for worst-case initial distributions. For m ≥ n, we show that every ball visits every bin in O(m log m) steps. For m = n, this improves the previous upper bound of O(n log² n) in [3], and for any n ≤ m ≤ poly(n) this is tight up to multiplicative constants. Full version of the paper at: https://arxiv.org/abs/2203.12400.
{"title":"Brief Announcement: Tight Bounds for Repeated Balls-into-Bins","authors":"Dimitrios Los, Thomas Sauerwald","doi":"10.1145/3490148.3538561","DOIUrl":"https://doi.org/10.1145/3490148.3538561","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
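The repeated balls-into-bins process described in the abstract above can be explored empirically with a small simulation. The following is a minimal sketch, not the authors' code: the function names `rbb_step` and `simulate`, the choice of starting with all balls in one bin, and the seed parameter are illustrative assumptions. It only observes the maximum load over time and proves nothing about the stated bounds.

```python
import random


def rbb_step(loads, rng):
    """Run one round: every non-empty bin releases one ball, and each
    released ball is placed into a bin chosen uniformly at random."""
    n = len(loads)
    # Count the balls that move this round (one per non-empty bin).
    movers = sum(1 for load in loads if load > 0)
    # Remove one ball from every non-empty bin.
    new_loads = [load - 1 if load > 0 else 0 for load in loads]
    # Re-throw each removed ball independently and uniformly.
    for _ in range(movers):
        new_loads[rng.randrange(n)] += 1
    return new_loads


def simulate(n, m, steps, seed=0):
    """Simulate `steps` rounds with m balls and n bins.

    Assumption for illustration: all m balls start in bin 0, one possible
    worst-case initial distribution. Returns the final loads and the
    maximum load observed after each round.
    """
    rng = random.Random(seed)
    loads = [0] * n
    loads[0] = m
    max_load_history = []
    for _ in range(steps):
        loads = rbb_step(loads, rng)
        max_load_history.append(max(loads))
    return loads, max_load_history


if __name__ == "__main__":
    final_loads, history = simulate(n=100, m=100, steps=2000, seed=1)
    print("maximum load in the last round:", history[-1])
```

Running it for several values of m and n and comparing the late-round maximum load against m/n · log n gives an informal feel for the bounds, under the caveat that small instances say little about asymptotics.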
Nathan Beckmann, Phillip B. Gibbons, Charles McGuffey
Real systems make use of a hierarchy ranging from small, fast memories to larger and slower storage devices [15]. Each level of the hierarchy organizes its data in blocks to simplify management and reduce overheads.
{"title":"Brief Announcement: Spatial Locality and Granularity Change in Caching","authors":"Nathan Beckmann, Phillip B. Gibbons, Charles McGuffey","doi":"10.1145/3490148.3538559","DOIUrl":"https://doi.org/10.1145/3490148.3538559","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
Groundbreaking work analyzing early Internet data revealed novel phenomena that became the basis of a new endeavor: Network Science. This exciting new field has revealed fundamental properties about communication, social, and biological networks. Simultaneously, the Internet has expanded enormously and is now a domain of activity as important to civilization as land, sea, air, and space. The initial Internet observations that nurtured network science have ballooned and become the largest dynamic streaming data sets available, creating fresh opportunities to examine the foundations of network science in previously unimagined detail. The analysis of streaming networks with trillions of events has stimulated the development of novel mathematics (e.g., associative array algebra), algorithms (e.g., hypersparse neural networks), software (e.g., GraphBLAS.org), and hardware. All of these capabilities are critically dependent on parallel processing. Application of these developments to the world's largest publicly available streaming event datasets has revealed a variety of new phenomena.
{"title":"Keynote Talk: Large Scale Parallel Sparse Matrix Streaming Graph/Network Analysis","authors":"J. Kepner","doi":"10.1145/3490148.3538597","DOIUrl":"https://doi.org/10.1145/3490148.3538597","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
Tianming Zhao, Chunhao Li, Wei Li, Albert Y. Zomaya
We consider the problem of non-clairvoyant scheduling on a single machine to minimize the total flow time with job size predictions. The existing algorithm achieves 2-consistency with respect to predictions, but no algorithm can simultaneously attain bounded robustness. This work finds a sufficient condition for any algorithm to achieve optimal O(P)-robustness, where P is the maximum ratio of any two job sizes. Using this condition, we give the first algorithm that achieves optimal robustness up to a constant multiplicative factor together with optimal consistency. Finally, to address small prediction errors, we present an algorithm that we conjecture to achieve the optimal O(η^2) competitive ratio, where η is the prediction error. Proving the claimed bound is our ongoing work.
{"title":"Brief Announcement: Towards a More Robust Algorithm for Flow Time Scheduling with Predictions","authors":"Tianming Zhao, Chunhao Li, Wei Li, Albert Y. Zomaya","doi":"10.1145/3490148.3538557","DOIUrl":"https://doi.org/10.1145/3490148.3538557","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
This work extends the composable secure-emulation of Canetti et al. to dynamic settings. Our work builds on top of dynamic probabilistic I/O automata, a recent framework introduced to model dynamic probabilistic systems. Our extension is an important tool towards the formal verification of protocols combining probabilistic distributed systems and cryptography in dynamic settings (e.g., blockchains, cybersecure distributed protocols, etc.).
{"title":"Brief Announcement: Composable Dynamic Secure Emulation","authors":"Pierre Civit, M. Potop-Butucaru","doi":"10.1145/3490148.3538562","DOIUrl":"https://doi.org/10.1145/3490148.3538562","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
In this talk, I report on a large-scale census of algorithm improvement spanning 11 sub-fields of computer science, 57 textbooks and more than 1,100 research papers. Across 113 algorithm problems, we find enormous variation in how fast algorithms have improved. Around half experience little or no improvement. At the other extreme, 13% experience transformative improvements, radically changing how and where they can be used. Overall, we find that, for moderate-sized problems, 30% to 45% of algorithmic problems had improvements comparable to or greater than those that users experienced from Moore's Law and other hardware advances. I will also discuss our comparison of the upper bounds and lower bounds for these algorithm problems, where we find that nearly two-thirds are already asymptotically optimal, representing a triumph for the field but also a challenge for future progress.
{"title":"Keynote Talk: Algorithm Improvement: How Fast Has It Been and How Much Farther Can It Go?","authors":"Neil C. Thompson","doi":"10.1145/3490148.3538596","DOIUrl":"https://doi.org/10.1145/3490148.3538596","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
We design a parallel algorithm for the Constrained Shortest Path (CSP) problem. The CSP problem is known to be NP-hard and there exists a pseudo-polynomial time sequential algorithm that solves it. To design the parallel algorithm, we extend the techniques used in the design of the Δ-stepping algorithm for the single-source shortest paths problem.
{"title":"Brief Announcement: A Parallel (Δ, Γ)-Stepping Algorithm for the Constrained Shortest Path Problem","authors":"Tayebeh Bahreini, N. Fisher, Daniel Grosu","doi":"10.1145/3490148.3538555","DOIUrl":"https://doi.org/10.1145/3490148.3538555","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
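To make the pseudo-polynomial claim in the abstract above concrete, here is a minimal sketch of a textbook sequential dynamic program for the CSP problem. It is not the authors' parallel (Δ, Γ)-stepping algorithm. It assumes non-negative edge costs and positive integer edge weights (the constrained resource); the function name, the edge-tuple layout, and the parameter `budget` are hypothetical choices made for illustration. The running time is O(budget · (|V| + |E|)), which is polynomial in the numeric value of the budget rather than in its bit length.

```python
import math


def constrained_shortest_path(num_vertices, edges, source, target, budget):
    """Return the minimum cost of a source-to-target path whose total
    weight is at most `budget`, or math.inf if no such path exists.

    edges: list of (u, v, cost, weight) tuples for directed edges, with
    cost >= 0 and weight a positive integer (assumptions of this sketch).
    """
    inf = math.inf
    # best[b][v] = minimum cost of a path from source to v with weight <= b.
    best = [[inf] * num_vertices for _ in range(budget + 1)]
    for b in range(budget + 1):
        best[b][source] = 0
    for b in range(1, budget + 1):
        row = best[b]
        previous = best[b - 1]
        # Any path that fits within budget b - 1 also fits within b.
        for v in range(num_vertices):
            if previous[v] < row[v]:
                row[v] = previous[v]
        # Extend paths by one edge; weight >= 1 guarantees that the row
        # best[b - weight] is already final when it is read here.
        for u, v, cost, weight in edges:
            if weight <= b:
                candidate = best[b - weight][u] + cost
                if candidate < row[v]:
                    row[v] = candidate
    return best[budget][target]


if __name__ == "__main__":
    # Two routes from 0 to 3: a cheap but heavy one and a costly but light one.
    example_edges = [(0, 1, 1, 5), (1, 3, 1, 5), (0, 2, 5, 1), (2, 3, 5, 1)]
    print(constrained_shortest_path(4, example_edges, 0, 3, budget=10))
    print(constrained_shortest_path(4, example_edges, 0, 3, budget=9))
```

In the example, a budget of 10 admits the cheap route, while a budget of 9 forces the costlier but lighter one. The table has one row per budget value, which is what makes the approach pseudo-polynomial.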
Jiwon Choe, A. Crotty, T. Moreshet, Maurice Herlihy, R. I. Bahar
In recent years, the ever-increasing impact of memory access bottlenecks has brought forth a renewed interest in near-memory processing (NMP) architectures. In this work, we propose and empirically evaluate hybrid data structures, which are concurrent data structures custom-designed for these new NMP architectures. We focus on cache-optimized data structures, such as skiplists and B+ trees, that are often used as index structures in online transaction processing (OLTP) systems to enable fast key-based lookups. These data structures are hierarchical, where lookups begin at a small number of top-level nodes and diverge to many different node paths as they move down the hierarchy, such that nodes in higher levels benefit more from caching. Our proposed hybrid data structures split traditional hierarchical data structures into a host-managed portion consisting of higher-level nodes and an NMP-managed portion consisting of the remaining lower-level nodes, thus retaining and further enhancing the cache-conscious optimizations of their conventional implementations. Although the idea might seem relatively simple, the splitting of the data structure prompts new synchronization problems, and careful implementation is required to ensure high concurrency and correctness. We provide implementations of a hybrid skiplist and a hybrid B+ tree, and we empirically evaluate them on a cycle-accurate full-system architecture simulator. Our results show that the hybrid data structures have the potential to improve performance by more than 2x compared to state-of-the-art concurrent data structures.
{"title":"HybriDS","authors":"Jiwon Choe, A. Crotty, T. Moreshet, Maurice Herlihy, R. I. Bahar","doi":"10.1145/3490148.3538591","DOIUrl":"https://doi.org/10.1145/3490148.3538591","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
We present a new model for (asynchronous) Byzantine reliable broadcast to investigate the potential of secret collusion between honest players. To model the collusion, we assume that each honest player has k > 1 distinct communication identities over which they can send and receive messages. A player can obtain these identities, for example, by joining a distributed system under several aliases.
{"title":"Brief Announcement: The (Limited) Power of Multiple Identities: Asynchronous Byzantine Reliable Broadcast with Improved Resilience through Collusion","authors":"Thorsten Götte, C. Scheideler","doi":"10.1145/3490148.3538556","DOIUrl":"https://doi.org/10.1145/3490148.3538556","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}
Resolving an open question from 2006, we prove the existence of lightweight bounded-degree (1+ε)-spanners for unit ball graphs in metrics of bounded doubling dimension, and we design a simple O(log* n)-round distributed algorithm in the LOCAL model for finding such spanners using only 2-hop neighborhood information. We further study the problem in the two-dimensional Euclidean plane and propose a construction with similar properties that also has a low-intersection property. Lastly, we provide experimental results that confirm the performance of our algorithms.
{"title":"Brief Announcement: Distributed Lightweight Spanner Construction for Unit Ball Graphs in Doubling Metrics","authors":"D. Eppstein, Hadi Khodabandeh","doi":"10.1145/3490148.3538553","DOIUrl":"https://doi.org/10.1145/3490148.3538553","journal":{"name":"Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures"},"publicationDate":"2022-07-11"}