
Latest Publications in ACM Transactions on Storage

Reliability Evaluation of Erasure-coded Storage Systems with Latent Errors
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-19 | DOI: 10.1145/3568313
I. Iliadis
Large-scale storage systems employ erasure-coding redundancy schemes to protect against device failures. The adverse effect of latent sector errors on the Mean Time to Data Loss (MTTDL) and the Expected Annual Fraction of Data Loss (EAFDL) reliability metrics is evaluated. A theoretical model capturing the effect of latent errors and device failures is developed, and closed-form expressions for the metrics of interest are derived. The MTTDL and EAFDL of erasure-coded systems are obtained analytically for (i) the entire range of bit error rates; (ii) the symmetric, clustered, and declustered data placement schemes; and (iii) arbitrary device failure and rebuild time distributions under network rebuild bandwidth constraints. The range of error rates that degrade system reliability is derived analytically. For realistic values of sector error rates, the results obtained demonstrate that MTTDL degrades, whereas, for moderate erasure codes, EAFDL remains practically unaffected. It is demonstrated that, in the range of typical sector error rates and for very powerful erasure codes, EAFDL degrades as well. It is also shown that the declustered data placement scheme offers superior reliability.
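For intuition on how latent errors enter this kind of analysis, the following back-of-the-envelope Python sketch contrasts the two data-loss paths in a textbook single-parity approximation. It is a sketch under simplifying assumptions (exponential failures, single-parity array), not the article's closed-form model, and the parameter values are illustrative only.

import math

def mttdl_single_parity(n, mttf_h, mttr_h, ber, disk_bits):
    """Textbook MTTDL approximation for an array tolerating one device
    failure: during a rebuild, data is lost either if a second device
    fails or if reading the n-1 surviving devices hits a latent
    (unrecoverable) bit error."""
    p_second_failure = (n - 1) * mttr_h / mttf_h
    # 1 - (1 - ber)^bits_read, computed stably for tiny ber, huge exponents
    p_latent = -math.expm1((n - 1) * disk_bits * math.log1p(-ber))
    return mttf_h / (n * (p_second_failure + p_latent))

# 10 drives, 1M-hour MTTF, 24-hour rebuild, 4 TB/drive, BER 1e-15:
# the latent-error term dwarfs the second-failure term, which is why
# MTTDL degrades at realistic sector error rates.
print(mttdl_single_parity(10, 1.0e6, 24.0, 1e-15, 4e12 * 8))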
Citations: 0
Efficient Crash Consistency for NVMe over PCIe and RDMA
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-19 | DOI: 10.1145/3568428
Xiaojian Liao, Youyou Lu, Zhe Yang, J. Shu
This article presents crash-consistent Non-Volatile Memory Express (ccNVMe), a novel extension of NVMe that defines how host software communicates with non-volatile memory (e.g., a solid-state drive) across a PCI Express bus and RDMA-capable networks with both crash consistency and performance efficiency. Existing storage systems pay a huge tax on crash consistency and thus cannot fully exploit the multi-queue parallelism and low latency of the NVMe and RDMA interfaces. ccNVMe alleviates this major bottleneck by coupling crash consistency to data dissemination. This new idea allows the storage system to achieve crash consistency by piggybacking on the data dissemination mechanism of NVMe, using only two lightweight memory-mapped I/Os (MMIOs), unlike traditional systems that use complex update protocols and synchronized block I/Os. ccNVMe introduces a series of techniques, including transaction-aware MMIO/doorbell and I/O command coalescing, to reduce PCIe traffic as well as to provide atomicity. We present how to build a high-performance and crash-consistent file system named MQFS atop ccNVMe. We experimentally show that MQFS increases the IOPS of RocksDB by 36% and 28% compared to a state-of-the-art file system and Ext4 without journaling, respectively.
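The coupling of commit semantics to NVMe's own command dissemination can be illustrated with a toy model. The class below is a conceptual sketch with invented names (not the ccNVMe driver code): staging commands and publishing the doorbell stand in for the two lightweight MMIOs, and atomicity follows because a crash before the doorbell update leaves the transaction invisible.

class TxQueue:
    """Toy ccNVMe-style commit: MMIO #1 posts commands to the queue,
    MMIO #2 advances the doorbell; only the doorbell-covered prefix
    is considered committed after a crash."""
    def __init__(self):
        self.entries = []   # posted commands (the durable log)
        self.doorbell = 0   # committed prefix, updated in one store

    def stage(self, cmds):
        self.entries.extend(cmds)           # MMIO #1: post commands

    def commit(self):
        self.doorbell = len(self.entries)   # MMIO #2: ring doorbell

    def recover(self):
        # Crash recovery: anything past the doorbell never happened.
        del self.entries[self.doorbell:]
        return self.entries

q = TxQueue()
q.stage(["write A", "write B"])  # a crash here recovers to []
q.commit()
assert q.recover() == ["write A", "write B"]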
Citations: 2
End-to-end I/O Monitoring on Leading Supercomputers
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-19 | DOI: 10.1145/3568425
Bin Yang, W. Xue, Tianyu Zhang, Shichao Liu, Xiaosong Ma, Xiyang Wang, Weiguo Liu
This paper offers a solution to overcome the complexities of production-system I/O performance monitoring. We present Beacon, an end-to-end I/O resource monitoring and diagnosis system for the 40960-node Sunway TaihuLight supercomputer, currently the fourth-ranked supercomputer in the world. Beacon simultaneously collects and correlates I/O tracing/profiling data from all the compute nodes, forwarding nodes, storage nodes, and metadata servers. With mechanisms such as aggressive online and offline trace compression and distributed caching/storage, it delivers scalable, low-overhead, and sustainable I/O diagnosis under production use. With Beacon’s deployment on TaihuLight for more than three years, we demonstrate Beacon’s effectiveness with real-world use cases for I/O performance issue identification and diagnosis. It has already successfully helped center administrators identify obscure design or configuration flaws, system anomaly occurrences, I/O performance interference, and resource under- or over-provisioning problems. Several of the exposed problems have already been fixed, with others being currently addressed. Encouraged by Beacon’s success in I/O monitoring, we extend it to monitor interconnection networks, which are another contention point on supercomputers. In addition, we demonstrate Beacon’s generality by extending it to other supercomputers. Both the Beacon code and part of the collected monitoring data have been released.
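To give a flavor of why aggressive trace compression pays off on I/O monitoring streams, here is a hypothetical sketch (the record layout and function are our assumptions, not Beacon's on-disk format): delta-encoding (timestamp, offset, size) triples turns sequential I/O, the common case, into long runs of identical bytes that deflate extremely well.

import struct
import zlib

def compress_trace(records):
    """Delta-encode per-request (timestamp_us, offset, size) records,
    then deflate the resulting byte stream."""
    out, prev_t, prev_off = bytearray(), 0, 0
    for t, off, size in records:
        out += struct.pack("<qqq", t - prev_t, off - prev_off, size)
        prev_t, prev_off = t, off
    return zlib.compress(bytes(out))

# 10,000 sequential 4 KiB requests: 240,000 raw bytes shrink to ~100.
recs = [(i * 100, i * 4096, 4096) for i in range(10000)]
print(len(compress_trace(recs)), "compressed vs", 10000 * 24, "raw")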
Citations: 41
Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-19 | DOI: 10.1145/3568427
Huaicheng Li, Martin L. Putra, Ronald Shi, Fadhil I. Kurnia, Xing Lin, Jaeyoung Do, A. I. Kistijantoro, G. Ganger, Haryadi S. Gunawi
Predictable latency on flash storage is a long-pursued goal, yet unpredictability persists due to the unavoidable disturbance from many well-known SSD internal activities. To combat this issue, the recent NVMe I/O Determinism (IOD) interface advocates host-level control over SSD internal management tasks. Although promising, challenges remain in how to exploit it for truly predictable performance. We present IODA, an I/O deterministic flash array design built on top of small but powerful extensions to the IOD interface for easy deployment. IODA exploits data redundancy in the context of IOD for a strong latency-predictability contract. In IODA, SSDs are expected to quickly fail an I/O on purpose to allow predictable I/Os through proactive data reconstruction. In the case of concurrent internal operations, IODA introduces busy remaining time exposure and predictable-latency-window formulation to guarantee predictable data reconstructions. Overall, IODA only adds five new fields to the NVMe interface and a small modification in the flash firmware, while keeping most of the complexity in the host OS. Our evaluation shows that IODA improves the 95–99.99th percentile latencies by up to 75×. IODA is also the nearest to the ideal, no-disturbance case compared to seven state-of-the-art preemption, suspension, GC coordination, partitioning, tiny-tail flash controller, prediction, and proactive approaches.
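A minimal sketch of the fail-fast-and-reconstruct idea, under the simplifying assumptions of a single-parity (XOR) array and an explicit busy flag per drive; the function and layout are illustrative, not IODA's NVMe extensions.

from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def predictable_read(drives, i):
    """If drive i is busy with internal work (e.g., GC), it fails the
    I/O immediately instead of stalling, and the block is rebuilt from
    the other data drives plus the XOR parity (last entry)."""
    busy, block = drives[i]
    if not busy:
        return block
    survivors = [blk for j, (_, blk) in enumerate(drives) if j != i]
    return reduce(xor, survivors)

data = [bytes([d]) * 4096 for d in (1, 2, 3)]
parity = reduce(xor, data)
drives = [(False, b) for b in data] + [(False, parity)]
drives[1] = (True, data[1])                    # drive 1 is garbage-collecting
assert predictable_read(drives, 1) == data[1]  # still served predictably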
Citations: 0
Oasis: Controlling Data Migration in Expansion of Object-based Storage Systems
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-19 | DOI: 10.1145/3568424
Yiming Zhang, Li Wang, Shun Gai, Qiwen Ke, Wenhao Li, Zhenlong Song, Guangtao Xue, J. Shu
Object-based storage systems have been widely used for various scenarios such as file storage, block storage, and blob (e.g., large videos) storage, where the data is placed among a large number of object storage devices (OSDs). Data placement is critical for the scalability of decentralized object-based storage systems. The state-of-the-art CRUSH placement method is a decentralized algorithm that deterministically places object replicas onto storage devices without relying on a central directory. While enjoying the benefits of decentralization such as high scalability, robustness, and performance, CRUSH-based storage systems suffer from uncontrolled data migration when expanding the capacity of the storage clusters (i.e., adding new OSDs), which is determined by the nature of CRUSH and causes significant performance degradation when the expansion is nontrivial. This article presents MapX, a novel extension to CRUSH that uses an extra time-dimension mapping (from object creation times to cluster expansion times) for controlling data migration after cluster expansions. Each expansion is viewed as a new layer of the CRUSH map, represented by a virtual node beneath the CRUSH root. MapX controls the mapping from objects onto layers by manipulating the timestamps of the intermediate placement groups (PGs). MapX is applicable to a large variety of object-based storage scenarios where object timestamps can be maintained as higher-level metadata. We have applied MapX to the state-of-the-art Ceph-RBD (RADOS Block Device) to implement a migration-controllable, decentralized object-based block store (called Oasis). Oasis extends the RBD metadata structure to maintain and retrieve approximate object creation times (for migration control) at the granularity of expansion layers. Experimental results show that the MapX-based Oasis block store outperforms the CRUSH-based Ceph-RBD (which is busy migrating objects after expansions) by 3.17×–4.31× in tail latency, and by 76.3% and 83.8% in IOPS for reads and writes, respectively.
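A toy illustration of the time-dimension idea, using a simple deterministic hash instead of real CRUSH; the expansions structure and place function are hypothetical names for sketching how an object's creation time pins it to the layer of OSDs that existed when it was created, so adding a layer migrates nothing.

import hashlib
from bisect import bisect_right

def place(obj_id, create_time, expansions, replicas=3):
    """expansions: sorted list of (expansion_time, osd_list); the
    object's creation time selects a layer, and a hash of its name
    deterministically picks OSDs within that layer."""
    times = [t for t, _ in expansions]
    layer = expansions[bisect_right(times, create_time) - 1][1]
    h = int.from_bytes(hashlib.sha256(obj_id.encode()).digest(), "big")
    return [layer[(h + k) % len(layer)] for k in range(replicas)]

layers = [(0, [f"osd{i}" for i in range(4)]),        # initial cluster
          (100, [f"osd{i}" for i in range(4, 12)])]  # expansion at t=100
print(place("obj-a", 50, layers))   # stays on the original 4 OSDs
print(place("obj-b", 150, layers))  # new objects use the new layer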
Citations: 0
The what, The from, and The to: The Migration Games in Deduplicated Systems
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-15 | DOI: https://dl.acm.org/doi/10.1145/3565025
Roei Kisous, Ariel Kolikant, Abhinav Duggal, Sarai Sheinvald, Gala Yadgar

Deduplication reduces the size of the data stored in large-scale storage systems by replacing duplicate data blocks with references to their unique copies. This creates dependencies between files that contain similar content and complicates the management of data in the system. In this article, we address the problem of data migration, in which files are remapped between different volumes as a result of system expansion or maintenance. The challenge of determining which files and blocks to migrate has been studied extensively for systems without deduplication. In the context of deduplicated storage, however, only simplified migration scenarios have been considered.

In this article, we formulate the general migration problem for deduplicated systems as an optimization problem whose objective is to minimize the system’s size while ensuring that the storage load is evenly distributed between the system’s volumes and that the network traffic required for the migration does not exceed its allocation.

We then present three algorithms for generating effective migration plans, each based on a different approach and representing a different trade-off between computation time and migration efficiency. Our greedy algorithm provides modest space savings but is appealing thanks to its exceptionally short runtime. Its results can be improved by using larger system representations. Our theoretically optimal algorithm formulates the migration problem as an integer linear programming (ILP) instance. Its migration plans consistently result in smaller and more balanced systems than those of the greedy approach, although its runtime is long and, as a result, the theoretical optimum is not always found. Our clustering algorithm enjoys the best of both worlds: its migration plans are comparable to those generated by the ILP-based algorithm, but its runtime is shorter, sometimes by an order of magnitude. It can be further accelerated at a modest cost in the quality of its results.
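As a sketch of the greedy flavor (far simpler than the article's three algorithms, and optimizing transfer traffic only, not load balance), one can repeatedly ship the file whose chunks are mostly already present on the target volume; files, dst_chunks, and the budget are illustrative names.

def greedy_plan(files, dst_chunks, traffic_budget):
    """files: file -> set of chunk fingerprints on the source volume.
    dst_chunks: fingerprints already stored on the target volume.
    Repeatedly move the file that is cheapest to ship, i.e., whose
    chunks are already mostly deduplicated against the target."""
    plan, spent, remaining = [], 0, dict(files)
    while remaining:
        f = min(remaining, key=lambda g: len(remaining[g] - dst_chunks))
        cost = len(remaining[f] - dst_chunks)  # chunks actually transferred
        if spent + cost > traffic_budget:
            break
        spent += cost
        dst_chunks |= remaining.pop(f)
        plan.append(f)
    return plan, spent

files = {"f1": {1, 2, 3}, "f2": {3, 4}, "f3": {9, 10, 11, 12}}
plan, traffic = greedy_plan(files, dst_chunks={2, 3}, traffic_budget=3)
print(plan, traffic)  # ['f1', 'f2'] 2 -- f3 alone would blow the budget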

Citations: 0
Improving the Endurance of Next Generation SSD's using WOM-v Codes
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-14 | DOI: 10.1145/3565027
Shehbaz Jaffer, K. Mahdaviani, Bianca Schroeder
High-density Solid State Drives, such as QLC drives, offer increased storage capacity but an order of magnitude fewer Program and Erase (P/E) cycles, limiting their endurance and hence usability. We present the design and implementation of non-binary, Voltage-Based Write-Once-Memory (WOM-v) codes to improve the lifetime of QLC drives. First, we develop a FEMU-based simulator test-bed to evaluate the gains of WOM-v codes on real-world workloads. Second, we propose and implement two optimizations, an efficient garbage collection mechanism and an encoding optimization, to drastically improve WOM-v code endurance without compromising performance. Third, we propose analytical approaches to obtain estimates of the endurance gains under WOM-v codes. We analyze the Greedy garbage collection technique with uniform page access distribution and the Least Recently Written (LRW) garbage collection technique with skewed page access distribution in the context of WOM-v codes. We find that although both approaches overestimate the number of required erase operations, the model based on greedy garbage collection with uniform page access distribution provides tighter bounds. A careful evaluation, including microbenchmarks and trace-driven evaluation, demonstrates that WOM-v codes can reduce erase cycles for QLC drives by 4.4×–11.1× for real-world workloads with minimal performance overheads, resulting in improved QLC SSD lifetime.
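For background on the write-once principle these codes generalize, here is the classic binary Rivest-Shamir WOM code, which fits two successive 2-bit writes into 3 cells whose bits may only go from 0 to 1. WOM-v extends the same idea to the multiple voltage levels of QLC cells; this sketch is textbook material, not the paper's construction.

GEN1 = {0b00: (0, 0, 0), 0b01: (1, 0, 0), 0b10: (0, 1, 0), 0b11: (0, 0, 1)}
GEN2 = {m: tuple(1 - b for b in c) for m, c in GEN1.items()}  # complements

def decode(cells):
    # Weight <= 1 means a first-generation codeword, else second.
    table = GEN1 if sum(cells) <= 1 else GEN2
    return next(m for m, c in table.items() if c == cells)

def rewrite(cells, msg):
    """Second write without an erase: keep the codeword if it already
    decodes to msg, otherwise switch to the complement codeword, which
    only ever raises bits (the write-once constraint, checked below)."""
    new = cells if decode(cells) == msg else GEN2[msg]
    assert all(b >= a for a, b in zip(cells, new))  # 0 -> 1 only
    return new

cells = GEN1[0b01]            # first write: 01 stored as 100
cells = rewrite(cells, 0b10)  # second write: 100 -> 101, no erase needed
assert decode(cells) == 0b10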
Citations: 0
A Disk Failure Prediction Method Based on Active Semi-supervised Learning
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-12 | DOI: https://dl.acm.org/doi/10.1145/3523699
Yang Zhou, Fang Wang, Dan Feng

Disk failure has always been a major problem for data centers, leading to data loss. Current disk failure prediction approaches are mostly offline and assume that the disk labels required for training learning models are available and accurate. However, these offline methods are no longer suitable for disk failure prediction tasks in large-scale data centers. Behind this explosive amount of data, most methods do not consider that label values may be hard to obtain during training or that the obtained label values may not be completely accurate. These problems further restrict the development of supervised learning and offline modeling in disk failure prediction. In this article, Active Semi-supervised Learning Disk-failure Prediction (ASLDP), a novel disk failure prediction method, is proposed, which uses active learning and semi-supervised learning. According to the characteristics of data in the disk lifecycle, ASLDP carries out active learning for clearly labeled samples, selecting valuable samples with the most significant probability uncertainty and eliminating redundancy. For samples that are unclearly labeled or unlabeled, ASLDP uses semi-supervised learning to pre-label them by calculating the conditional values of the samples, and enhances its generalization ability through active learning. Compared with several state-of-the-art offline and online learning approaches, the results on four realistic datasets from Backblaze and Baidu demonstrate that ASLDP achieves stable failure detection rates of 80–85% with low false alarm rates. In addition, we use a dataset from Alibaba to evaluate the generality of ASLDP. Furthermore, ASLDP can overcome the problem of missing sample labels and data redundancy in large data centers, which, to the best of our knowledge, is not considered or implemented in any offline learning method for disk failure prediction. Finally, ASLDP can predict disk failure 4.9 days in advance with lower overhead and latency.
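The generic active plus semi-supervised round that such a method builds on can be sketched as follows; the names, model choice, and thresholds are our assumptions, not ASLDP's actual algorithm (which additionally exploits conditional values over the disk lifecycle).

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def active_semi_round(clf, X_lab, y_lab, X_unlab, k=10, conf=0.95):
    """One round: train on the labeled pool, send the k most uncertain
    unlabeled samples to a human (active learning), and pseudo-label
    the most confident ones (semi-supervised learning)."""
    clf.fit(X_lab, y_lab)
    proba = clf.predict_proba(X_unlab)  # binary: columns [healthy, failing]
    query_idx = np.argsort(np.abs(proba[:, 1] - 0.5))[:k]
    pseudo_idx = np.where(proba.max(axis=1) >= conf)[0]
    return query_idx, pseudo_idx, proba[pseudo_idx].argmax(axis=1)

rng = np.random.default_rng(0)
X_lab, y_lab = rng.random((200, 8)), rng.integers(0, 2, 200)
X_unlab = rng.random((1000, 8))
q, p, y_p = active_semi_round(RandomForestClassifier(50), X_lab, y_lab, X_unlab)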

Citations: 0
Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-12 | DOI: https://dl.acm.org/doi/10.1145/3510613
Nicolas Nicolaou, Viveck Cadambe, N. Prakash, Andria Trigeorgi, Kishori Konwar, Muriel Medard, Nancy Lynch

Emulating a shared atomic, read/write storage system is a fundamental problem in distributed computing. Replicating atomic objects among a set of data hosts was the norm for traditional implementations (e.g., [11]) in order to guarantee the availability and accessibility of the data despite host failures. As replication is highly storage demanding, recent approaches suggested the use of erasure codes to offer the same fault tolerance while optimizing storage usage at the hosts. Initial works focused on a fixed set of data hosts. To guarantee longevity and scalability, a storage service should be able to dynamically mask host failures by allowing new hosts to join and failed hosts to be removed without service interruptions. This work presents the first erasure-code-based atomic algorithm, called Ares, which allows the set of hosts to be modified in the course of an execution. Ares is composed of three main components: (i) a reconfiguration protocol, (ii) a read/write protocol, and (iii) a set of data access primitives (DAPs). The design of Ares is modular and accommodates the usage of various erasure-code parameters on a per-configuration basis. We provide bounds on the latency of read/write operations and analyze the storage and communication costs of the Ares algorithm.
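A toy two-phase write in the spirit of erasure-coded atomic storage primitives; Host, encode, and put are illustrative stand-ins (naive striping instead of a real (n, k) code, no failure handling, reads omitted), not Ares's actual DAPs.

class Host:
    """Keeps the fragment carrying the highest tag it has seen."""
    def __init__(self):
        self.tag, self.frag = 0, b""

    def get_tag(self):
        return self.tag

    def put(self, tag, frag):
        if tag > self.tag:
            self.tag, self.frag = tag, frag
        return True  # ack

def encode(value, n):
    # Stand-in for an (n, k) erasure code: contiguous striping, no parity.
    step = -(-len(value) // n)  # ceiling division
    return [value[i * step:(i + 1) * step] for i in range(n)]

def put(hosts, value, quorum):
    # Phase 1: discover the highest tag from a quorum and pick a higher one.
    new_tag = max(h.get_tag() for h in hosts[:quorum]) + 1
    # Phase 2: disseminate one coded fragment per host, await quorum acks.
    acks = sum(h.put(new_tag, f) for h, f in zip(hosts, encode(value, len(hosts))))
    assert acks >= quorum
    return new_tag

hosts = [Host() for _ in range(5)]
put(hosts, b"hello, atomic world", quorum=3)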

Citations: 0
Tunable Encrypted Deduplication with Attack-resilient Key Management
IF 1.7 | CAS Tier 3, Computer Science | Q3 Computer Science | Pub Date: 2022-11-11 | DOI: https://dl.acm.org/doi/10.1145/3510614
Zuoru Yang, Jingwei Li, Yanjing Ren, Patrick P. C. Lee

Conventional encrypted deduplication approaches retain the deduplication capability on duplicate chunks after encryption by always deriving the key for encryption/decryption from the chunk content, but such a deterministic nature causes information leakage due to frequency analysis. We present TED, a tunable encrypted deduplication primitive that provides a tunable mechanism for balancing the trade-off between storage efficiency and data confidentiality. The core idea of TED is that its key derivation is based on not only the chunk content but also the number of duplicate chunk copies, such that duplicate chunks are encrypted by distinct keys in a controlled manner. In particular, TED allows users to configure a storage blowup factor, under which the information leakage quantified by an information-theoretic measure is minimized for any input workload. In addition, we extend TED with a distributed key management architecture and propose two attack-resilient key generation schemes that trade off performance against fault tolerance. We implement an encrypted deduplication prototype TEDStore to realize TED in networked environments. Evaluation on real-world file system snapshots shows that TED effectively balances the trade-off between storage efficiency and data confidentiality, with small performance overhead.
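A minimal sketch of tunable key derivation under our assumptions (the bucketing rule and HMAC construction are illustrative, not TED's exact scheme): the key depends on the chunk content plus a copy-count bucket of size t, so t tunes the trade-off. A huge t degenerates to convergent encryption (maximal dedup, maximal frequency leakage), while t = 1 gives per-copy keys (no cross-copy dedup).

import hashlib
import hmac

def derive_key(server_secret, chunk, copies_seen, t):
    """Every t-th duplicate of the same chunk gets a fresh key, so at
    most t stored copies ever share a ciphertext. A key server applies
    its secret so clients cannot brute-force keys offline."""
    fp = hashlib.sha256(chunk).digest()
    bucket = copies_seen // t
    return hmac.new(server_secret, fp + bucket.to_bytes(8, "big"),
                    hashlib.sha256).digest()

secret = b"key-server-secret"
k0 = derive_key(secret, b"chunk bytes", copies_seen=0, t=4)
k3 = derive_key(secret, b"chunk bytes", copies_seen=3, t=4)
k4 = derive_key(secret, b"chunk bytes", copies_seen=4, t=4)
assert k0 == k3 and k0 != k4  # copies within a bucket of t dedup together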

Citations: 0