{"title":"在混合 KV 放置的 LSM 键值存储中实现高效复制的索引运输","authors":"Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas","doi":"10.1145/3658672","DOIUrl":null,"url":null,"abstract":"<p>Key-value (KV) stores based on LSM tree have become a foundational layer in the storage stack of datacenters and cloud services. Current approaches for achieving reliability and availability favor reducing network traffic and send to replicas only new KV pairs. As a result, they perform costly compactions to reorganize data in both the primary and backup nodes, which increases device I/O traffic and CPU overhead, and eventually hurts overall system performance. In this paper we describe <i>Tebis</i>, an efficient LSM-based KV store that reduces I/O amplification and CPU overhead for maintaining the replica index. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends pre-built indexes to backup nodes, avoiding all compactions in backup nodes. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the pre-built region index. Our results show that <i>Tebis</i>\nreduces resource utilization on backup nodes compared to performing full compactions: Throughput is increased by 1.06 − 2.90 ×, CPU efficiency is increased by 1.21 − 2.78 ×, and I/O amplification is reduced by 1.7 − 3.27 ×, while network traffic increases by up to 1.32 − 3.76 ×.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"6 1","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement\",\"authors\":\"Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas\",\"doi\":\"10.1145/3658672\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Key-value (KV) stores based on LSM tree have become a foundational layer in the storage stack of datacenters and cloud services. Current approaches for achieving reliability and availability favor reducing network traffic and send to replicas only new KV pairs. As a result, they perform costly compactions to reorganize data in both the primary and backup nodes, which increases device I/O traffic and CPU overhead, and eventually hurts overall system performance. In this paper we describe <i>Tebis</i>, an efficient LSM-based KV store that reduces I/O amplification and CPU overhead for maintaining the replica index. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends pre-built indexes to backup nodes, avoiding all compactions in backup nodes. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the pre-built region index. Our results show that <i>Tebis</i>\\nreduces resource utilization on backup nodes compared to performing full compactions: Throughput is increased by 1.06 − 2.90 ×, CPU efficiency is increased by 1.21 − 2.78 ×, and I/O amplification is reduced by 1.7 − 3.27 ×, while network traffic increases by up to 1.32 − 3.76 ×.</p>\",\"PeriodicalId\":49113,\"journal\":{\"name\":\"ACM Transactions on Storage\",\"volume\":\"6 1\",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2024-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Storage\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3658672\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3658672","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement
Key-value (KV) stores based on LSM tree have become a foundational layer in the storage stack of datacenters and cloud services. Current approaches for achieving reliability and availability favor reducing network traffic and send to replicas only new KV pairs. As a result, they perform costly compactions to reorganize data in both the primary and backup nodes, which increases device I/O traffic and CPU overhead, and eventually hurts overall system performance. In this paper we describe Tebis, an efficient LSM-based KV store that reduces I/O amplification and CPU overhead for maintaining the replica index. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends pre-built indexes to backup nodes, avoiding all compactions in backup nodes. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the pre-built region index. Our results show that Tebis
reduces resource utilization on backup nodes compared to performing full compactions: Throughput is increased by 1.06 − 2.90 ×, CPU efficiency is increased by 1.21 − 2.78 ×, and I/O amplification is reduced by 1.7 − 3.27 ×, while network traffic increases by up to 1.32 − 3.76 ×.
期刊介绍:
The ACM Transactions on Storage (TOS) is a new journal with an intent to publish original archival papers in the area of storage and closely related disciplines. Articles that appear in TOS will tend either to present new techniques and concepts or to report novel experiences and experiments with practical systems. Storage is a broad and multidisciplinary area that comprises of network protocols, resource management, data backup, replication, recovery, devices, security, and theory of data coding, densities, and low-power. Potential synergies among these fields are expected to open up new research directions.