SSRAID：提高串行接口固态盘 RAID 写入性能的条带-队列和条带-线程合并 I/O 策略

IF 6 2区计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS IEEE Transactions on Parallel and Distributed Systems Pub Date : 2024-08-14 DOI:10.1109/TPDS.2024.3443083

Peixuan Li;Ping Xie;Qiang Cao

{"title":"SSRAID：提高串行接口固态盘 RAID 写入性能的条带-队列和条带-线程合并 I/O 策略","authors":"Peixuan Li;Ping Xie;Qiang Cao","doi":"10.1109/TPDS.2024.3443083","DOIUrl":null,"url":null,"abstract":"RAID (Redundant Array of Independent Disks) has been widely used to enhance read and write performance of existing storage systems. Existing software RAID do not fully utilize write performance of Serial interface SSDs (Solid State Drive). The most popular software RAID currently is Linux Multiple-Disks (MD), and the latest software RAID is StRAID. We observe that both of these software RAID methods lead to thread contention in multi-threaded mode, especially when applied to Serial interface SSDs. Multiple threads writing to same address can limit write performance. In this paper, we propose a stripe-queued and stripe-threaded merging I/O strategy. First, SSRAID segregates write requests across different stripes using a set of stripe-queues and stripe-threads to prevent interference between them. As a result, write thread contention in SSRAID is eliminated, allowing stripe-threads to maintain the highest efficiency of parallelism. Secondly, SSRAID can merge write requests from the same stripe-queue multiple times through stripe-thread, effectively reducing the number of additional write I/Os. Finally, SSRAID presents a stage buffer based on data merging. During partial stripe-write, write-induced read I/Os on the SSD are transformed into direct access to the stage buffer, effectively reducing write-induced read I/Os. Compared to StRAID, SSRAID improves average sequential write throughput by 86% and reduces average sequential write latency by 61% in the optimal case.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"35 10","pages":"1841-1853"},"PeriodicalIF":6.0000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SSRAID: A Stripe-Queued and Stripe-Threaded Merging I/O Strategy to Improve Write Performance of Serial Interface SSD RAID\",\"authors\":\"Peixuan Li;Ping Xie;Qiang Cao\",\"doi\":\"10.1109/TPDS.2024.3443083\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"RAID (Redundant Array of Independent Disks) has been widely used to enhance read and write performance of existing storage systems. Existing software RAID do not fully utilize write performance of Serial interface SSDs (Solid State Drive). The most popular software RAID currently is Linux Multiple-Disks (MD), and the latest software RAID is StRAID. We observe that both of these software RAID methods lead to thread contention in multi-threaded mode, especially when applied to Serial interface SSDs. Multiple threads writing to same address can limit write performance. In this paper, we propose a stripe-queued and stripe-threaded merging I/O strategy. First, SSRAID segregates write requests across different stripes using a set of stripe-queues and stripe-threads to prevent interference between them. As a result, write thread contention in SSRAID is eliminated, allowing stripe-threads to maintain the highest efficiency of parallelism. Secondly, SSRAID can merge write requests from the same stripe-queue multiple times through stripe-thread, effectively reducing the number of additional write I/Os. Finally, SSRAID presents a stage buffer based on data merging. During partial stripe-write, write-induced read I/Os on the SSD are transformed into direct access to the stage buffer, effectively reducing write-induced read I/Os. Compared to StRAID, SSRAID improves average sequential write throughput by 86% and reduces average sequential write latency by 61% in the optimal case.\",\"PeriodicalId\":13257,\"journal\":{\"name\":\"IEEE Transactions on Parallel and Distributed Systems\",\"volume\":\"35 10\",\"pages\":\"1841-1853\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2024-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Parallel and Distributed Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10636805/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Parallel and Distributed Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10636805/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

摘要

RAID（独立磁盘冗余阵列）已被广泛用于提高现有存储系统的读写性能。现有的软件 RAID 无法充分利用串行接口固态硬盘（SSD）的写入性能。目前最流行的软件 RAID 是 Linux Multiple-Disks（MD），最新的软件 RAID 是 StRAID。我们发现，这两种软件 RAID 方法在多线程模式下都会导致线程争用，尤其是在应用于串行接口固态硬盘时。多个线程写入同一地址会限制写入性能。在本文中，我们提出了一种条带排队和条带线程合并 I/O 策略。首先，SSRAID 使用一组条带队列和条带线程将写入请求隔离到不同的条带上，以防止它们之间的干扰。因此，SSRAID 中的写线程竞争得以消除，从而使条带线程保持最高的并行效率。其次，SSRAID 可以通过条带线程多次合并来自同一条带队列的写入请求，从而有效减少额外的写入 I/O 数量。最后，SSRAID 提出了基于数据合并的阶段缓冲。在部分条带写入过程中，固态硬盘上由写入引起的读 I/O 将转化为对阶段缓冲区的直接访问，从而有效减少由写入引起的读 I/O。与 StRAID 相比，在最佳情况下，SSRAID 将平均连续写吞吐量提高了 86%，将平均连续写延迟降低了 61%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

SSRAID: A Stripe-Queued and Stripe-Threaded Merging I/O Strategy to Improve Write Performance of Serial Interface SSD RAID

RAID (Redundant Array of Independent Disks) has been widely used to enhance read and write performance of existing storage systems. Existing software RAID do not fully utilize write performance of Serial interface SSDs (Solid State Drive). The most popular software RAID currently is Linux Multiple-Disks (MD), and the latest software RAID is StRAID. We observe that both of these software RAID methods lead to thread contention in multi-threaded mode, especially when applied to Serial interface SSDs. Multiple threads writing to same address can limit write performance. In this paper, we propose a stripe-queued and stripe-threaded merging I/O strategy. First, SSRAID segregates write requests across different stripes using a set of stripe-queues and stripe-threads to prevent interference between them. As a result, write thread contention in SSRAID is eliminated, allowing stripe-threads to maintain the highest efficiency of parallelism. Secondly, SSRAID can merge write requests from the same stripe-queue multiple times through stripe-thread, effectively reducing the number of additional write I/Os. Finally, SSRAID presents a stage buffer based on data merging. During partial stripe-write, write-induced read I/Os on the SSD are transformed into direct access to the stage buffer, effectively reducing write-induced read I/Os. Compared to StRAID, SSRAID improves average sequential write throughput by 86% and reduces average sequential write latency by 61% in the optimal case.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Parallel and Distributed Systems 工程技术-工程：电子与电气

CiteScore

11.00

自引率

9.40%

发文量

281

审稿时长

5.6 months

期刊介绍： IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers. Particular areas of interest include, but are not limited to: a) Parallel and distributed algorithms, focusing on topics such as: models of computation; numerical, combinatorial, and data-intensive parallel algorithms, scalability of algorithms and data structures for parallel and distributed systems, communication and synchronization protocols, network algorithms, scheduling, and load balancing. b) Applications of parallel and distributed computing, including computational and data-enabled science and engineering, big data applications, parallel crowd sourcing, large-scale social network analysis, management of big data, cloud and grid computing, scientific and biomedical applications, mobile computing, and cyber-physical systems. c) Parallel and distributed architectures, including architectures for instruction-level and thread-level parallelism; design, analysis, implementation, fault resilience and performance measurements of multiple-processor systems; multicore processors, heterogeneous many-core systems; petascale and exascale systems designs; novel big data architectures; special purpose architectures, including graphics processors, signal processors, network processors, media accelerators, and other special purpose processors and accelerators; impact of technology on architecture; network and interconnect architectures; parallel I/O and storage systems; architecture of the memory hierarchy; power-efficient and green computing architectures; dependable architectures; and performance modeling and evaluation. d) Parallel and distributed software, including parallel and multicore programming languages and compilers, runtime systems, operating systems, Internet computing and web services, resource management including green computing, middleware for grids, clouds, and data centers, libraries, performance modeling and evaluation, parallel programming paradigms, and programming environments and tools.