Information Leakage in Encrypted Deduplication via Frequency Analysis
Jingwei Li, P. Lee, Chufeng Tan, Chuan Qin, Xiaosong Zhang
Encrypted deduplication combines encryption and deduplication to simultaneously achieve both data security and storage efficiency. State-of-the-art encrypted deduplication systems mainly build on deterministic encryption to preserve deduplication effectiveness. However, such deterministic encryption reveals the underlying frequency distribution of the original plaintext chunks. This allows an adversary to launch frequency analysis against the ciphertext chunks and infer the content of the original plaintext chunks. In this article, we study how frequency analysis affects information leakage in encrypted deduplication, from both attack and defense perspectives. Specifically, we target backup workloads and propose a new inference attack that exploits chunk locality to increase the coverage of inferred chunks. We further combine the new inference attack with the knowledge of chunk sizes and show its attack effectiveness against variable-size chunks. We conduct trace-driven evaluation on both real-world and synthetic datasets and show that our proposed attacks infer a significant fraction of plaintext chunks under backup workloads. To defend against frequency analysis, we present two defense approaches, namely MinHash encryption and scrambling. Our trace-driven evaluation shows that our combined MinHash encryption and scrambling scheme effectively mitigates the severity of the inference attacks, while maintaining high storage efficiency and incurring limited metadata access overhead.
{"title":"Information Leakage in Encrypted Deduplication via Frequency Analysis","authors":"Jingwei Li, P. Lee, Chufeng Tan, Chuan Qin, Xiaosong Zhang","doi":"10.1145/3365840","DOIUrl":"https://doi.org/10.1145/3365840","url":null,"abstract":"Encrypted deduplication combines encryption and deduplication to simultaneously achieve both data security and storage efficiency. State-of-the-art encrypted deduplication systems mainly build on deterministic encryption to preserve deduplication effectiveness. However, such deterministic encryption reveals the underlying frequency distribution of the original plaintext chunks. This allows an adversary to launch frequency analysis against the ciphertext chunks and infer the content of the original plaintext chunks. In this article, we study how frequency analysis affects information leakage in encrypted deduplication, from both attack and defense perspectives. Specifically, we target backup workloads and propose a new inference attack that exploits chunk locality to increase the coverage of inferred chunks. We further combine the new inference attack with the knowledge of chunk sizes and show its attack effectiveness against variable-size chunks. We conduct trace-driven evaluation on both real-world and synthetic datasets and show that our proposed attacks infer a significant fraction of plaintext chunks under backup workloads. To defend against frequency analysis, we present two defense approaches, namely MinHash encryption and scrambling. Our trace-driven evaluation shows that our combined MinHash encryption and scrambling scheme effectively mitigates the severity of the inference attacks, while maintaining high storage efficiency and incurring limited metadata access overhead.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129215553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Introduction to the Special Issue on ACM International Systems and Storage Conference (SYSTOR) 2018
G. Yadgar, Donald E. Porter
This issue of ACM Transactions on Storage brings some of the highlights of the 11th ACM International Systems and Storage Conference (SYSTOR’18), held in Haifa, Israel, during June 2018. SYSTOR is an international forum for interaction across the systems research community, attracting authors and participants from all over the world. Of the 44 submitted papers, 10 were accepted, and we invited the authors of two notable papers to extend them for publication in this special issue. The first article, which was selected as the best paper at the conference, “Lerna: Parallelizing Dependent Loops Using Speculation,” by Mohamed M. Saad, Roberto Palmieri, and Binoy Ravindran, presents a technique for using speculation to parallelize code containing data dependencies. Lerna uses a combination of static analysis and profiling to rewrite a sequential program into a parallel one and manages the concurrent execution of jobs using transactional memory. It is the first framework to carry out this process automatically, without any input from the programmer. The second article, “REGISTOR: A Platform for Unstructured Data Processing Inside SSD Storage,” by Shuyi Pei, Jing Yang, and Qing Yang, presents the design, implementation, and evaluation of an in-device regular-expression engine for SSDs. The resulting smart storage avoids transferring data over the low-bandwidth I/O bus, increasing the application-perceived throughput by up to 10×. This is a significant step toward addressing the challenges of the Big Data era.
{"title":"Introduction to the Special Issue on ACM International Systems and Storage Conference (SYSTOR) 2018","authors":"G. Yadgar, Donald E. Porter","doi":"10.1145/3313898","DOIUrl":"https://doi.org/10.1145/3313898","url":null,"abstract":"This issue of ACM Transactions on Storage brings some of the highlights of the 11th ACM International Systems and Storage Conference (SYSTOR’18), held in Haifa, Israel, during June 2018. SYSTOR is an international forum for interaction across the systems research community, attracting authors and participants from all over the world. Of the 44 submitted and 10 accepted papers, we invited the authors of two notable papers to extend them for publication in this special issue. The first article, which was selected as the best paper at the conference, “Lerna: Parallelizing Dependent Loops Using Speculation,” by Mohamed M. Saad, Roberto Palmieri, and Binoy Ravindran, presents a technique for using speculation to parallelize code containing data dependencies. Lerna uses a combination of static analysis and profiling to rewrite a sequential program into a parallel one and manages the concurrent execution of jobs using transactional memory. This is the first framework for executing such process automatically, without any input from the programmers. The second article, “REGISTOR: A Platform for Unstructured Data Processing Inside SSD Storage,” by Shuyi Pei, Jing Yang, and Qing Yang, presents the design, implementation, and evaluation of an in-device regular-expression engine for SSDs. The resulting smart storage avoids transfer of data through the low-bandwidth I/O bus, increasing the application-perceived throughput by up to 10×. This is a significant step toward addressing the challenges of the Big Data era.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121836669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM TOS Distinguished Reviewers
S. Noh
In this first issue of 2019, there are a couple of announcements that I would like to make. First, Associate Editors Salima Benbernou, Haryadi Gunawi, Jiri Schindler, and Thomas Schwarz are stepping down after many years of devoted service. I applaud and thank them for their devotion and hard work, which have allowed Transactions on Storage (TOS) to continue and maintain the high quality of publications it offers today. Second, our ACM TOS team would like to express our appreciation to the numerous academic reviewers who contributed to the peer-review process these past years. Our entire community is indebted to these volunteers, who have generously given their time and expertise to thoroughly review the papers that were submitted. In the past two years, ACM TOS received the help of over 30 editorial board members along with 180 reviewers to curate nearly 250 submissions and publish more than 70 articles with meaningful and impactful results. While all of the reviewers are listed on our website (https://tos.acm.org/), I take this opportunity to acknowledge our distinguished reviewers, who went out of their way to provide careful, thorough, and timely reviews that stood out. These names are based on the recommendations of the Associate Editors through whom the reviews were solicited. Again, we thank all the reviewers for their dedication and support for ACM TOS and the computer system storage community as a whole. Thank you all.
{"title":"ACM TOS Distinguished Reviewers","authors":"S. Noh","doi":"10.1145/3313879","DOIUrl":"https://doi.org/10.1145/3313879","url":null,"abstract":"In this first issue of 2019, there are a couple of announcements that I would like to make. First, Associate Editors Salima Benbernou, Haryadi Gunawi, Jiri Schindler, and Thomas Schwarz are stepping down after many years of devoted service. I applaud and thank them for their devotion and hard work that has allowed Transactions on Storage (TOS) to continue and maintain the high quality of publications it is offering today. Second, our ACM TOS team would like to express our appreciation to the numerous academic reviewers who contributed to the peer-review process these past years. Our entire community is indebted to these volunteers, who have generously given their time and expertise to thoroughly review the papers that were submitted. In the recent two years, ACM TOS received the help of over 30 editorial board members along with 180 reviewers to curate nearly 250 submissions and publish more than 70 articles with meaningful and impactful results. While the list of all the reviewers have been listed in our website (https://tos.acm.org/), I take this opportunity to list out our distinguished reviewers who went out of their way to provide careful, thorough, and timely reviews that stood out among the reviews. These names are based on the recommendations of the Associate Editors through whom the reviews were solicited. Again, we thank all the reviewers for your dedication and support for ACM TOS and the computer system storage community as a whole. Thank you all.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132502774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Lifecycle of the File
Michael J. May, Etamar Laron, Khalid Zoabi, Havah Gerhardt
Users and Operating Systems (OSs) have vastly different views of files. OSs use files to persist data and structured information. To accomplish this, OSs treat files as named collections of bytes managed in hierarchical file systems. Despite their critical role in computing, little attention is paid to the lifecycle of the file, the evolution of file contents, or the evolution of file metadata. In contrast, users have rich mental models of files: they group files into projects, send data repositories to others, work on documents over time, and stash them aside for future use. Current OSs and Revision Control Systems ignore such mental models, persisting a selective, manually designated history of revisions. Preserving the mental model allows applications to better match how users view their files, making file processing and archiving tools more effective. We propose two mechanisms that OSs can adopt to better preserve the mental model: File Lifecycle Events (FLEs) that record a file’s progression and Complex File Events (CFEs) that combine them into meaningful patterns. We present the Complex File Events Engine (CoFEE), which uses file system monitoring and an extensible rulebase (Drools) to detect FLEs and convert them into complex ones. CFEs are persisted in NoSQL stores for later querying.
{"title":"On the Lifecycle of the File","authors":"Michael J. May, Etamar Laron, Khalid Zoabi, Havah Gerhardt","doi":"10.1145/3295463","DOIUrl":"https://doi.org/10.1145/3295463","url":null,"abstract":"Users and Operating Systems (OSs) have vastly different views of files. OSs use files to persist data and structured information. To accomplish this, OSs treat files as named collections of bytes managed in hierarchical file systems. Despite their critical role in computing, little attention is paid to the lifecycle of the file, the evolution of file contents, or the evolution of file metadata. In contrast, users have rich mental models of files: they group files into projects, send data repositories to others, work on documents over time, and stash them aside for future use. Current OSs and Revision Control Systems ignore such mental models, persisting a selective, manually designated history of revisions. Preserving the mental model allows applications to better match how users view their files, making file processing and archiving tools more effective. We propose two mechanisms that OSs can adopt to better preserve the mental model: File Lifecycle Events (FLEs) that record a file’s progression and Complex File Events (CFEs) that combine them into meaningful patterns. We present the Complex File Events Engine (CoFEE), which uses file system monitoring and an extensible rulebase (Drools) to detect FLEs and convert them into complex ones. CFEs are persisted in NoSQL stores for later querying.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132414643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging Glocality for Fast Failure Recovery in Distributed RAM Storage
Yiming Zhang, Dongsheng Li, Ling Liu
Distributed RAM storage aggregates the RAM of servers in data center networks (DCN) to provide extremely high I/O performance for large-scale cloud systems. For quick recovery from storage server failures, MemCube [53] exploits the proximity of the BCube network to limit the recovery traffic to the recovery servers’ 1-hop neighborhood. However, the previous design is applicable only to the symmetric BCube(n,k) network with n^(k+1) nodes and has suboptimal recovery performance due to congestion and contention. To address these problems, in this article, we propose CubeX, which (i) generalizes the “1-hop” principle of MemCube to arbitrary cube-based networks and (ii) improves the throughput and recovery performance of a RAM-based key-value (KV) store via cross-layer optimizations. The core idea of CubeX is to leverage the glocality (= globality + locality) of cube-based networks: it scatters backup data across a large number of disks globally distributed throughout the cube and restricts all recovery traffic to the small local range of each server node. Our evaluation shows that CubeX not only efficiently supports a RAM-based KV store on cube-based networks but also significantly outperforms MemCube and RAMCloud in both throughput and recovery time.
{"title":"Leveraging Glocality for Fast Failure Recovery in Distributed RAM Storage","authors":"Yiming Zhang, Dongsheng Li, Ling Liu","doi":"10.1145/3289604","DOIUrl":"https://doi.org/10.1145/3289604","url":null,"abstract":"Distributed RAM storage aggregates the RAM of servers in data center networks (DCN) to provide extremely high I/O performance for large-scale cloud systems. For quick recovery of storage server failures, MemCube [53] exploits the proximity of the BCube network to limit the recovery traffic to the recovery servers’ 1-hop neighborhood. However, the previous design is applicable only to the symmetric BCube(n,k) network with nk+1 nodes and has suboptimal recovery performance due to congestion and contention. To address these problems, in this article, we propose CubeX, which (i) generalizes the “1-hop” principle of MemCube for arbitrary cube-based networks and (ii) improves the throughput and recovery performance of RAM-based key-value (KV) store via cross-layer optimizations. At the core of CubeX is to leverage the glocality (= globality + locality) of cube-based networks: It scatters backup data across a large number of disks globally distributed throughout the cube and restricts all recovery traffic within the small local range of each server node. Our evaluation shows that CubeX not only efficiently supports RAM-based KV store for cube-based networks but also significantly outperforms MemCube and RAMCloud in both throughput and recovery time.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127002535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TDDFS: A Tier-aware Data Deduplication-based File System
Zhichao Cao, Hao Wen, Xiongzi Ge, Jingwei Ma, Jim Diehl, D. Du
With the rapid increase in the amount of data produced and the development of new types of storage devices, storage tiering continues to be a popular way to achieve a good tradeoff between performance and cost-effectiveness. In a basic two-tier storage system, a storage tier with higher performance and typically higher cost (the fast tier) is used to store frequently-accessed (active) data while a large amount of less-active data are stored in the lower-performance and low-cost tier (the slow tier). Data are migrated between these two tiers according to their activity. In this article, we propose a Tier-aware Data Deduplication-based File System, called TDDFS, which can operate efficiently on top of a two-tier storage environment. Specifically, to achieve better performance, nearly all file operations are performed in the fast tier. To achieve higher cost-effectiveness, files are migrated from the fast tier to the slow tier if they are no longer active, and this migration is done with data deduplication. The distinctiveness of our design is that it maintains the non-redundant (unique) chunks produced by data deduplication in both tiers if possible. When a file is reloaded (called a reloaded file) from the slow tier to the fast tier, if some data chunks of the file already exist in the fast tier, then the data migration of these chunks from the slow tier can be avoided. Our evaluation shows that TDDFS achieves close to the best overall performance among various file-tiering designs for two-tier storage systems.
{"title":"TDDFS","authors":"Zhichao Cao, Hao Wen, Xiongzi Ge, Jingwei Ma, Jim Diehl, D. Du","doi":"10.1145/3295461","DOIUrl":"https://doi.org/10.1145/3295461","url":null,"abstract":"With the rapid increase in the amount of data produced and the development of new types of storage devices, storage tiering continues to be a popular way to achieve a good tradeoff between performance and cost-effectiveness. In a basic two-tier storage system, a storage tier with higher performance and typically higher cost (the fast tier) is used to store frequently-accessed (active) data while a large amount of less-active data are stored in the lower-performance and low-cost tier (the slow tier). Data are migrated between these two tiers according to their activity. In this article, we propose a Tier-aware Data Deduplication-based File System, called TDDFS, which can operate efficiently on top of a two-tier storage environment. Specifically, to achieve better performance, nearly all file operations are performed in the fast tier. To achieve higher cost-effectiveness, files are migrated from the fast tier to the slow tier if they are no longer active, and this migration is done with data deduplication. The distinctiveness of our design is that it maintains the non-redundant (unique) chunks produced by data deduplication in both tiers if possible. When a file is reloaded (called a reloaded file) from the slow tier to the fast tier, if some data chunks of the file already exist in the fast tier, then the data migration of these chunks from the slow tier can be avoided. Our evaluation shows that TDDFS achieves close to the best overall performance among various file-tiering designs for two-tier storage systems.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"132 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114017768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploiting Internal Parallelism for Address Translation in Solid-State Drives
Wei Xie, Yong Chen, P. Roth
Solid-State Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism (parallel access to multiple internal flash memory chips) and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow I/O schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved overall performance by up to 32% for the real I/O workloads we tested, and by up to two orders of magnitude with synthetic test workloads. We also found that Parallel-DFTL is able to achieve reasonable performance with a very small cache size and that it provides the best benefit for workloads with large request sizes or high write ratios.
{"title":"Exploiting Internal Parallelism for Address Translation in Solid-State Drives","authors":"Wei Xie, Yong Chen, P. Roth","doi":"10.1145/3239564","DOIUrl":"https://doi.org/10.1145/3239564","url":null,"abstract":"Solid-state Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism—parallel access to multiple internal flash memory chips—and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. We also found that Parallel-DFTL is able to achieve reasonable performance with a very small cache size and that it provides the best benefit for those workloads with large request size or with high write ratio.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"41 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131851931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to the Special Issue on SYSTOR 2017","authors":"Peter Desnoyers, E. D. Lara","doi":"10.1145/3287097","DOIUrl":"https://doi.org/10.1145/3287097","url":null,"abstract":"","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131644172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FlashNet
A. Trivedi, Nikolas Ioannou, B. Metzler, Patrick Stuedi, Jonas Pfefferle, K. Kourtis, Ioannis Koltsidas, T. Gross
During the past decade, network and storage devices have undergone rapid performance improvements, delivering ultra-low latency and several Gbps of bandwidth. Nevertheless, current network and storage stacks fail to deliver this hardware performance to the applications, often due to the loss of I/O efficiency from stalled CPU performance. While many efforts attempt to address this issue solely on either the network or the storage stack, achieving high performance for networked-storage applications requires a holistic approach that considers both. In this article, we present FlashNet, a software I/O stack that unifies high-performance network properties with flash storage access and management. FlashNet builds on RDMA principles and abstractions to provide a direct, asynchronous, end-to-end data path between a client and remote flash storage. The key insight behind FlashNet is to co-design the stack’s components (an RDMA controller, a flash controller, and a file system) to enable cross-stack optimizations and maximize I/O efficiency. In micro-benchmarks, FlashNet improves 4kB network I/O operations per second (IOPS) by 38.6% to 1.22M, decreases access latency by 43.5% to 50.4μs, and prolongs the flash lifetime by 1.6-5.9× for writes. We illustrate the capabilities of FlashNet by building a Key-Value store and porting a distributed data store that uses RDMA onto it. The use of FlashNet’s RDMA API improves the performance of the KV store by 2× and requires only minimal changes for the ported data store to access remote flash devices.
{"title":"FlashNet","authors":"A. Trivedi, Nikolas Ioannou, B. Metzler, Patrick Stuedi, Jonas Pfefferle, K. Kourtis, Ioannis Koltsidas, T. Gross","doi":"10.1145/3239562","DOIUrl":"https://doi.org/10.1145/3239562","url":null,"abstract":"During the past decade, network and storage devices have undergone rapid performance improvements, delivering ultra-low latency and several Gbps of bandwidth. Nevertheless, current network and storage stacks fail to deliver this hardware performance to the applications, often due to the loss of I/O efficiency from stalled CPU performance. While many efforts attempt to address this issue solely on either the network or the storage stack, achieving high-performance for networked-storage applications requires a holistic approach that considers both. In this article, we present FlashNet, a software I/O stack that unifies high-performance network properties with flash storage access and management. FlashNet builds on RDMA principles and abstractions to provide a direct, asynchronous, end-to-end data path between a client and remote flash storage. The key insight behind FlashNet is to co-design the stack’s components (an RDMA controller, a flash controller, and a file system) to enable cross-stack optimizations and maximize I/O efficiency. In micro-benchmarks, FlashNet improves 4kB network I/O operations per second (IOPS by 38.6% to 1.22M, decreases access latency by 43.5% to 50.4μs, and prolongs the flash lifetime by 1.6-5.9× for writes. We illustrate the capabilities of FlashNet by building a Key-Value store and porting a distributed data store that uses RDMA on it. The use of FlashNet’s RDMA API improves the performance of KV store by 2× and requires minimum changes for the ported data store to access remote flash devices.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123985344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance Characterization of NVMe-over-Fabrics Storage Disaggregation
Zvika Guz, Harry Li, A. Shayesteh, V. Balakrishnan
Storage disaggregation separates compute and storage onto different nodes to allow for independent resource scaling and, thus, better hardware resource utilization. While disaggregation of hard-drive storage is a common practice, NVMe-SSD (i.e., PCIe-based SSD) disaggregation is considered more challenging. This is because SSDs are significantly faster than hard drives, so the latency overheads (due to both network and CPU processing), as well as the extra compute cycles needed for the offloading stack, become much more pronounced. In this work, we characterize the overheads of NVMe-SSD disaggregation. We show that NVMe-over-Fabrics (NVMe-oF), a recently released remote storage protocol specification, reduces the overheads of remote access to a bare minimum, thus greatly increasing the cost-efficiency of Flash disaggregation. Specifically, while recent work showed that SSD storage disaggregation via iSCSI degrades application-level throughput by 20%, we report negligible performance degradation with NVMe-oF, both when using stress tests and with a more realistic KV-store workload.
{"title":"Performance Characterization of NVMe-over-Fabrics Storage Disaggregation","authors":"Zvika Guz, Harry Li, A. Shayesteh, V. Balakrishnan","doi":"10.1145/3239563","DOIUrl":"https://doi.org/10.1145/3239563","url":null,"abstract":"Storage disaggregation separates compute and storage to different nodes to allow for independent resource scaling and, thus, better hardware resource utilization. While disaggregation of hard-drives storage is a common practice, NVMe-SSD (i.e., PCIe-based SSD) disaggregation is considered more challenging. This is because SSDs are significantly faster than hard drives, so the latency overheads (due to both network and CPU processing) as well as the extra compute cycles needed for the offloading stack become much more pronounced. In this work, we characterize the overheads of NVMe-SSD disaggregation. We show that NVMe-over-Fabrics (NVMe-oF)—a recently released remote storage protocol specification—reduces the overheads of remote access to a bare minimum, thus greatly increasing the cost-efficiency of Flash disaggregation. Specifically, while recent work showed that SSD storage disaggregation via iSCSI degrades application-level throughput by 20%, we report on negligible performance degradation with NVMe-oF—both when using stress-tests as well as with a more-realistic KV-store workload.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"533 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133355014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}