Concurrent Data Structures Made Easy (Extended Version)
Callista Le, Kiran Gopinathan, Koon Wen Lee, Seth Gilbert, Ilya Sergey
arXiv - CS - Programming Languages, 25 August 2024. arXiv:2408.13779
Abstract
Design of an efficient thread-safe concurrent data structure is a balancing act between its implementation complexity and performance. Lock-based concurrent data structures, which are relatively easy to derive from their sequential counterparts and to prove thread-safe, suffer from poor throughput under even a light multi-threaded workload. At the same time, lock-free concurrent structures allow for high throughput, but are notoriously difficult to get right and require careful reasoning to formally establish their correctness.

We explore a solution to this conundrum based on batch parallelism, an approach for designing concurrent data structures via a simple insight: efficiently processing a batch of a priori known operations in parallel is easier than optimising performance for a stream of arbitrary asynchronous requests. Alas, batch-parallel structures have not seen wide practical adoption due to (i) the inconvenience of having to structure multi-threaded programs to explicitly group operations and (ii) the lack of a systematic methodology to implement batch-parallel structures as simply as lock-based ones.

We present OBatcher, an OCaml library that streamlines the design, implementation, and usage of batch-parallel structures. It solves the first challenge (how to use) by suggesting a new lightweight implicit batching design that is built on top of generic asynchronous programming mechanisms. The second challenge (how to implement) is addressed by identifying a family of strategies for converting common sequential structures into efficient batch-parallel ones.

We showcase OBatcher with a diverse set of benchmarks. Our evaluation of all the implementations on large asynchronous workloads shows that (a) they consistently outperform the corresponding coarse-grained lock-based implementations and that (b) their throughput scales reasonably with the number of processors.
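
To make the core idea concrete, the sketch below illustrates batch parallelism in OCaml. It is a minimal, sequential illustration only: the module names (Batched_set, Batcher), the operation type, and the explicit flush call are assumptions made for this example, not OBatcher's actual API; a real batch-parallel implementation would dispatch the batch to worker domains and let the asynchronous scheduler trigger flushes implicitly, as the paper's implicit batching design suggests.

```ocaml
(* Hypothetical sketch of the batch-parallel idea; names and signatures
   below are illustrative, not OBatcher's API. *)

(* A "batched" structure exposes a single entry point that receives a whole
   array of pending operations, each paired with a callback for its result. *)
module Batched_set = struct
  module S = Set.Make (Int)

  type t = { mutable set : S.t }

  type op =
    | Insert of int * (unit -> unit)   (* key, completion callback *)
    | Member of int * (bool -> unit)   (* key, result callback *)

  let create () = { set = S.empty }

  (* Process an a-priori known batch.  Because the whole batch is visible,
     the implementation may reorder it: here all membership queries are
     answered against the old snapshot, then all inserts are applied at
     once.  A parallel version could split the work across domains. *)
  let run_batch (t : t) (batch : op array) : unit =
    let snapshot = t.set in
    Array.iter
      (function Member (k, reply) -> reply (S.mem k snapshot) | _ -> ())
      batch;
    let inserts =
      Array.to_list batch
      |> List.filter_map (function Insert (k, _) -> Some k | _ -> None)
    in
    t.set <- List.fold_left (fun s k -> S.add k s) t.set inserts;
    Array.iter (function Insert (_, done_) -> done_ () | _ -> ()) batch
end

(* A toy "implicit batcher": clients submit single operations, the batcher
   buffers them and hands the whole buffer to run_batch in one go. *)
module Batcher = struct
  type t = { set : Batched_set.t; mutable pending : Batched_set.op list }

  let create () = { set = Batched_set.create (); pending = [] }
  let submit t op = t.pending <- op :: t.pending

  let flush t =
    let batch = Array.of_list (List.rev t.pending) in
    t.pending <- [];
    Batched_set.run_batch t.set batch
end

let () =
  let b = Batcher.create () in
  Batcher.submit b (Batched_set.Insert (1, fun () -> ()));
  Batcher.submit b (Batched_set.Member (1, fun present ->
    Printf.printf "1 present in old snapshot? %b\n" present));
  Batcher.flush b;
  Batcher.submit b (Batched_set.Member (1, fun present ->
    Printf.printf "1 present after flush? %b\n" present));
  Batcher.flush b
```

The point of the interface is that run_batch sees the entire batch of a priori known operations at once, so it is free to reorder them (here, answering all queries against one snapshot before applying all inserts) or to split them across processors, without per-operation locking.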