{"title":"Efficient Crash Consistency for NVMe over PCIe and RDMA","authors":"Xiaojian Liao, Youyou Lu, Zhe Yang, J. Shu","doi":"10.1145/3568428","DOIUrl":null,"url":null,"abstract":"This article presents crash-consistent Non-Volatile Memory Express (ccNVMe), a novel extension of the NVMe that defines how host software communicates with the non-volatile memory (e.g., solid-state drive) across a PCI Express bus and RDMA-capable networks with both crash consistency and performance efficiency. Existing storage systems pay a huge tax on crash consistency, and thus cannot fully exploit the multi-queue parallelism and low latency of the NVMe and RDMA interfaces. ccNVMe alleviates this major bottleneck by coupling the crash consistency to the data dissemination. This new idea allows the storage system to achieve crash consistency by taking the free rides of the data dissemination mechanism of NVMe, using only two lightweight memory-mapped I/Os (MMIOs), unlike traditional systems that use complex update protocol and synchronized block I/Os. ccNVMe introduces a series of techniques including transaction-aware MMIO/doorbell and I/O command coalescing to reduce the PCIe traffic as well as to provide atomicity. We present how to build a high-performance and crash-consistent file system named MQFS atop ccNVMe. We experimentally show that MQFS increases the IOPS of RocksDB by 36% and 28% compared to a state-of-the-art file system and Ext4 without journaling, respectively.","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":" ","pages":"1 - 35"},"PeriodicalIF":2.1000,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3568428","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 2
Abstract
This article presents crash-consistent Non-Volatile Memory Express (ccNVMe), a novel extension of the NVMe that defines how host software communicates with the non-volatile memory (e.g., solid-state drive) across a PCI Express bus and RDMA-capable networks with both crash consistency and performance efficiency. Existing storage systems pay a huge tax on crash consistency, and thus cannot fully exploit the multi-queue parallelism and low latency of the NVMe and RDMA interfaces. ccNVMe alleviates this major bottleneck by coupling the crash consistency to the data dissemination. This new idea allows the storage system to achieve crash consistency by taking the free rides of the data dissemination mechanism of NVMe, using only two lightweight memory-mapped I/Os (MMIOs), unlike traditional systems that use complex update protocol and synchronized block I/Os. ccNVMe introduces a series of techniques including transaction-aware MMIO/doorbell and I/O command coalescing to reduce the PCIe traffic as well as to provide atomicity. We present how to build a high-performance and crash-consistent file system named MQFS atop ccNVMe. We experimentally show that MQFS increases the IOPS of RocksDB by 36% and 28% compared to a state-of-the-art file system and Ext4 without journaling, respectively.
期刊介绍:
The ACM Transactions on Storage (TOS) is a new journal with an intent to publish original archival papers in the area of storage and closely related disciplines. Articles that appear in TOS will tend either to present new techniques and concepts or to report novel experiences and experiments with practical systems. Storage is a broad and multidisciplinary area that comprises of network protocols, resource management, data backup, replication, recovery, devices, security, and theory of data coding, densities, and low-power. Potential synergies among these fields are expected to open up new research directions.