ACM Transactions on Storage (TOS): Latest Articles, Page 5

Hybrid Codes

ACM Transactions on Storage (TOS)

Pub Date : 2020-09-24 DOI: 10.1145/3407193

Liuqing Ye, D. Feng, Yuchong Hu, Xueliang Wei

Erasure codes are being extensively deployed in practical storage systems to prevent data loss with low redundancy. However, these codes require excessive disk I/O and network traffic to recover unavailable data. Among all erasure codes, Minimum Storage Regenerating (MSR) codes achieve the optimal repair bandwidth under minimum storage during recovery, but some open issues remain to be addressed before they can be applied in real systems. Faced with this heavy recovery burden, erasure-coded storage systems need high repair efficiency. To this end, a new class of coding scheme is introduced: Hybrid Regenerating Codes (Hybrid-RC). The codes exploit the advantages of MSR codes to compute a subset of data blocks, while some other parity blocks are used for reliability maintenance. As a result, our design is near-optimal with respect to storage and network traffic and shows great improvements in recovery performance.
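
For context, the repair-bandwidth advantage of MSR codes mentioned in this abstract can be made concrete with the standard regenerating-code bound. The symbols (B for file size, k for the number of data blocks, d for the number of helper nodes contacted) follow the usual notation in the regenerating-codes literature, and the numeric example is illustrative only; neither is taken from the article itself.

```latex
% Per-node storage and total repair traffic at the MSR point
\alpha_{\mathrm{MSR}} = \frac{B}{k},
\qquad
\gamma_{\mathrm{MSR}} = \frac{d\,B}{k\,(d-k+1)}

% Illustrative parameters: k = 10, d = 13
\gamma_{\mathrm{MSR}} = \frac{13\,B}{10 \cdot 4} = 0.325\,B
\quad\text{versus}\quad
\gamma_{\mathrm{RS}} = k \cdot \frac{B}{k} = B
```

Under these assumed parameters, repairing one block moves roughly a third of the data that a conventional Reed-Solomon repair would read.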

Citations: 2

Inspection and Characterization of App File Usage in Mobile Devices

ACM Transactions on Storage (TOS)

Pub Date : 2020-09-24 DOI: 10.1145/3404119

Cheng Ji, Riwei Pan, Li-Pin Chang, Liang Shi, Zongwei Zhu, Yu Liang, Tei-Wei Kuo, C. Xue

While the computing power of mobile devices has evolved quickly in recent years, the growth of mobile storage capacity has been comparatively slow. A common problem shared by budget-phone users is that they frequently run out of storage space. This article conducts a deep inspection of the file usage of mobile applications and its potential implications for user experience. Our major findings are as follows: First, mobile applications can rapidly consume storage space by creating temporary cache files, but these cache files quickly become obsolete after being reused for a short period of time. Second, access patterns of large files, especially executable files, appear highly sparse and random, and therefore large portions of file space are never visited. Third, file prefetching brings an excessive amount of file data into the page cache, but only a small fraction of the prefetched data is actually used. The unnecessary memory pressure causes premature memory reclamation and prolongs application launch time. Through a feasibility study of two preliminary optimizations, we demonstrate a high potential to eliminate unnecessary storage and memory space consumption with minimal impact on user experience.
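
The second finding (sparse, random access to large files) can be quantified with a simple coverage metric. The sketch below is a minimal illustration, assuming a trace of (offset, length) reads and a 4 KiB page size; the function name and the sample trace are hypothetical and do not come from the article.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def visited_fraction(file_size, accesses, page_size=PAGE_SIZE):
    """Return the fraction of a file's pages touched by (offset, length) accesses."""
    total_pages = (file_size + page_size - 1) // page_size
    if total_pages == 0:
        return 0.0
    touched = set()
    for offset, length in accesses:
        if length <= 0:
            continue
        first = offset // page_size
        # Clamp to the last page so reads past end-of-file are not counted.
        last = min((offset + length - 1) // page_size, total_pages - 1)
        touched.update(range(first, last + 1))
    return len(touched) / total_pages


# Hypothetical trace: three small reads scattered across a 1 MiB file.
trace = [(0, 100), (500_000, 8192), (1_000_000, 10)]
coverage = visited_fraction(1 << 20, trace)
```

A low value of `coverage` for a large executable would be consistent with the observation that most of the file is never visited.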

Citations: 17

Spiffy

ACM Transactions on Storage (TOS)

Pub Date : 2020-08-04 DOI: 10.1145/3386368

K. Sun, D. Fryer, Russell Wang, Sagar Patel, J. Chu, Matthew Lakier, Angela Demke Brown, Ashvin Goel

Many file-system applications, such as defragmentation tools, file-system checkers, and data recovery tools, operate at the storage layer. Today, developers of these file-system-aware storage applications need detailed knowledge of the file-system format, which takes significant time to learn, often by trial and error, because the format is insufficiently documented or specified. Furthermore, these applications perform ad hoc processing of the file-system metadata, leading to bugs and vulnerabilities. We propose Spiffy, an annotation language for specifying the on-disk format of a file system. File-system developers annotate the data structures of a file system, and we use these annotations to generate a library that allows identifying, parsing, and traversing file-system metadata, providing support for both offline and online storage applications. This approach simplifies the development of storage applications that work across different file systems because it reduces the amount of file-system-specific code that needs to be written. We have written annotations for the Linux Ext4, Btrfs, and F2FS file systems, and developed several applications for these file systems, including a type-specific metadata corruptor, a file-system converter, an online storage layer cache that preferentially caches files for certain users, and a runtime file-system checker. Our experiments show that applications built with the Spiffy library for accessing file-system metadata can achieve good performance and are robust against file-system corruption errors.
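
The general idea of generating a parser from a declarative description of an on-disk structure can be sketched as follows. This is not Spiffy's annotation language or generated library; the layout list, field names, and `make_parser` helper are hypothetical, and the sample values are arbitrary test data.

```python
import struct

# Hypothetical annotated layout: (field name, struct format code).
SUPERBLOCK_LAYOUT = [
    ("magic", "H"),        # 16-bit identifier
    ("block_size", "I"),   # 32-bit block size in bytes
    ("inode_count", "I"),  # 32-bit number of inodes
]


def make_parser(layout, byte_order="<"):
    """Build a parse(buf) function from a declarative field layout."""
    fmt = byte_order + "".join(code for _, code in layout)
    names = [name for name, _ in layout]
    size = struct.calcsize(fmt)

    def parse(buf):
        if len(buf) < size:
            raise ValueError("buffer too small for structure")
        return dict(zip(names, struct.unpack_from(fmt, buf)))

    return parse


parse_superblock = make_parser(SUPERBLOCK_LAYOUT)
raw = struct.pack("<HII", 0xEF53, 4096, 1024)  # sample bytes for the demo
sb = parse_superblock(raw)
```

Because the application code only sees field names, supporting another format in this sketch means supplying a different layout list rather than writing new parsing code.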

Citations: 9

Batch-file Operations to Optimize Massive Files Accessing

ACM Transactions on Storage (TOS)

Pub Date : 2020-07-16 DOI: 10.1145/3394286

Yang Yang, Q. Cao, Jie Yao, Hong Jiang, Li Yang

Existing local file systems, designed to support only a typical single-file access mode, can perform poorly when accessing a batch of files, especially small files. This single-file mode essentially serializes accesses to batched files one by one, resulting in a large number of non-sequential, random, and often dependent I/Os between file data and metadata at the storage end. Such an access mode can further degrade the efficiency and performance of applications that access massive numbers of files, such as data migration. We first experimentally analyze the root cause of this inefficiency in batch-file accesses. Then, we propose a novel batch-file access approach, referred to as BFO for its set of optimized Batch-File Operations, by developing novel BFOr and BFOw operations for the fundamental read and write processes, respectively, using a two-phase access for metadata and data jointly. BFO offers dedicated interfaces for batch-file accesses and additional processes integrated into existing file systems without modifying their structures and procedures. In addition, based on BFOr and BFOw, we propose a novel batch-file migration operation, BFOm, to accelerate data migration for massive numbers of small files. We implement a BFO prototype on ext4, one of the most popular file systems. Our evaluation results show that the batch-file read and write performance of BFO is consistently higher than that of the traditional approaches regardless of access patterns, data layouts, and storage media, under synthetic and real-world file sets. BFO improves read performance by up to 22.4× and 1.8× with HDD and SSD, respectively, and it boosts write performance by up to 111.4× and 2.9× with HDD and SSD, respectively. BFO also demonstrates consistent performance advantages for data migration in both local and remote situations.
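
The two-phase idea (gather metadata for the whole batch first, then read data in a layout-friendly order) can be approximated in user space. The sketch below is only an analogy: BFO is implemented inside ext4, whereas this hypothetical `batch_read` helper uses inode order as a rough, assumed proxy for on-disk order.

```python
import os


def batch_read(paths):
    """Phase 1: stat every file; phase 2: read contents in inode order."""
    # Phase 1: collect metadata for the whole batch before touching any data.
    metadata = [(os.stat(p).st_ino, p) for p in paths]
    # Phase 2: read files sorted by inode number, an assumed proxy for
    # physical placement, instead of the caller's arbitrary order.
    contents = {}
    for _, path in sorted(metadata):
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

Calling `batch_read(list_of_paths)` returns a dictionary mapping each path to its bytes; any speedup depends on the file system and medium and is not guaranteed by this sketch.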

Citations: 1

Cosmos+ OpenSSD

ACM Transactions on Storage (TOS)

Pub Date : 2020-07-16 DOI: 10.1145/3385073

Jaewook Kwak, Sangjin Lee, Kibin Park, Jinwoo Jeong, Y. Song

As semiconductor technology has advanced, many storage systems have begun to use non-volatile memories as storage media. The organization and architecture of storage controllers have become more complex to meet various design requirements in terms of performance, response time, quality of service (QoS), and so on. In addition, due to the evolution of memory technology and the emergence of new applications, storage controllers employ new firmware algorithms and hardware modules. When designing storage controllers, engineers often evaluate the performance impact of using new software and hardware components using software simulators. However, this technique often yields limited evaluation accuracy because of the difficulty of modeling complex operations of components and the interactions among them. In this article, we present a reconfigurable flash storage controller design that serves as a rapid prototype. This design can be synthesized into a field-programmable gate array device and used in a realistic performance evaluation environment. We show the usefulness of our design by demonstrating the performance impact of design parameters.

Citations: 18

Cache What You Need to Cache

ACM Transactions on Storage (TOS)

Pub Date : 2020-07-16 DOI: 10.1145/3397766

Hua Wang, Jiawei Zhang, Ping-Hsiu Huang, Xinbo Yi, Bin Cheng, Ke Zhou

The SSD plays an important role in caching systems due to its high performance-to-cost ratio. Since the cache space is typically smaller than the backend storage by an order of magnitude or more, the write density (defined as writes per unit time and space) of the SSD cache is much higher than that of HDD storage, which poses tremendous challenges to the SSD’s lifetime. Meanwhile, under social network workloads, many writes to the SSD cache are unnecessary. For example, our study of Tencent’s photo caching shows that about 61% of all photos are accessed only once, yet they are still swapped in and out of the cache. Therefore, if we can predict these photos proactively and prevent them from entering the cache, we can eliminate unnecessary SSD cache writes and improve cache space utilization. To cope with this challenge, we put forward a “one-time-access criteria” that is applied to the cache space and further propose a “one-time-access-exclusion” policy. Based on these two techniques, we design a prediction-based classifier to facilitate the policy. Unlike state-of-the-art history-based predictions, our prediction is non-history oriented, which makes good prediction accuracy challenging to achieve. To address this issue, we integrate a decision tree into the classifier, extract social-related information as classifying features, and apply cost-sensitive learning to improve classification precision. With these techniques, we attain a prediction accuracy greater than 80%. Experimental results show that the one-time-access-exclusion approach results in outstanding cache performance in most aspects. Take LRU, for instance: applying our approach improves the hit rate by 4.4%, decreases cache writes by 56.8%, and cuts the average access latency by 5.5%.
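
The effect of excluding one-time-access objects from an LRU cache can be illustrated with a small simulation. This is a minimal sketch, not the article's system: the `AdmissionLRU` class is hypothetical, and the predictor passed in here is a perfect oracle on a toy trace, standing in for the decision-tree classifier described above.

```python
from collections import OrderedDict


class AdmissionLRU:
    """LRU cache that skips insertion of keys predicted to be accessed once."""

    def __init__(self, capacity, is_one_time):
        self.capacity = capacity
        self.is_one_time = is_one_time  # hypothetical predictor: key -> bool
        self.items = OrderedDict()
        self.hits = 0
        self.writes = 0

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # refresh recency on a hit
            self.hits += 1
            return True
        if self.is_one_time(key):
            return False  # served from the backend, no cache write
        self.items[key] = True
        self.writes += 1
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        return False


# Toy trace: "x*" keys are seen once, "a" and "b" are reused.
trace = ["a", "x1", "b", "x2", "a", "x3", "b", "a"]
plain = AdmissionLRU(2, lambda k: False)
filtered = AdmissionLRU(2, lambda k: k.startswith("x"))
for k in trace:
    plain.access(k)
    filtered.access(k)
```

On this toy trace, the filtered cache performs fewer writes and gets more hits than plain LRU, because one-time keys no longer evict reusable ones; real gains depend on predictor accuracy.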

Citations: 3

ShieldNVM

ACM Transactions on Storage (TOS)

Pub Date : 2020-05-08 DOI: 10.1145/3381835

Fan Yang, Youmin Chen, Haiyu Mao, Youyou Lu, J. Shu

Data encryption and authentication are essential for secure non-volatile memory (NVM). However, the security metadata they introduce must be written back to NVM atomically along with the data to provide crash consistency, which unfortunately incurs high overhead. To support fine-grained data protection and fast recovery for a secure NVM system without compromising performance, we propose ShieldNVM. It first introduces an epoch-based mechanism that aggressively caches the security metadata in the metadata cache while retaining its consistency in NVM. Deferred spreading is also introduced to reduce the computation overhead of data authentication. Leveraging data hash message authentication codes, we can always recover consistent but old security metadata to its newest version. By recording a limited number of dirty addresses of the security metadata, ShieldNVM achieves fast recovery of the secure NVM system after crashes. Compared to Osiris, a state-of-the-art secure NVM, ShieldNVM reduces system runtime by 39.1% and hash message authentication code computation overhead by 80.5% on average over NVM workloads. When the system crashes, ShieldNVM’s recovery time is orders of magnitude shorter than that of Osiris. In addition, ShieldNVM recovers faster than AGIT, the Osiris-based state-of-the-art mechanism that addresses the recovery time of secure NVM systems. If the recovery process fails because of a malicious attack, ShieldNVM does not drop all data; instead, it detects and locates the tampered data with the help of the tracked addresses.
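
The last point, locating tampered data by checking only the tracked addresses, can be illustrated with ordinary HMACs. This sketch covers just that one idea under simplifying assumptions; the key, the `mac` and `find_tampered` helpers, and the dictionary "memory" are hypothetical, and the epoch mechanism and deferred spreading are not modeled.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical secret used only for this illustration


def mac(address, data):
    """Compute an HMAC bound to both the block address and its contents."""
    msg = address.to_bytes(8, "little") + data
    return hmac.new(KEY, msg, hashlib.sha256).digest()


def find_tampered(blocks, macs, tracked_addresses):
    """Return tracked addresses whose stored MAC does not match their data."""
    return [
        a for a in tracked_addresses
        if not hmac.compare_digest(mac(a, blocks[a]), macs[a])
    ]


blocks = {0: b"alpha", 64: b"beta", 128: b"gamma"}
macs = {a: mac(a, d) for a, d in blocks.items()}
blocks[64] = b"BETA"  # simulate tampering with one block
suspects = find_tampered(blocks, macs, [0, 64, 128])
```

Restricting the check to a short list of recorded addresses is what keeps the verification pass small in this sketch, compared with re-verifying every block.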

Citations: 1

Finding Bugs in File Systems with an Extensible Fuzzing Framework

ACM Transactions on Storage (TOS)

Pub Date : 2020-05-08 DOI: 10.1145/3391202

Seulbae Kim, Meng Xu, Sanidhya Kashyap, Jungyeon Yoon, Wen Xu, Taesoo Kim

File systems are too large to be bug free. Although handwritten test suites have been widely used to stress file systems, they can hardly keep up with the rapid increase in file system size and complexity, leading to new bugs being introduced. These bugs come in various flavors, from buffer overflows to complicated semantic bugs. Although bug-specific checkers exist, they generally lack a way to explore file system states thoroughly. More importantly, no turnkey solution exists that unifies the checking effort for various aspects of a file system under one umbrella. In this article, to highlight the potential of applying fuzzing to find any type of file system bug in a generic way, we propose Hydra, an extensible fuzzing framework. Hydra provides building blocks for file system fuzzing, including input mutators, feedback engines, test executors, and bug post-processors. As a result, developers only need to focus on building the core logic for finding the bugs they are interested in. We showcase the effectiveness of Hydra with four checkers that hunt crash inconsistency, POSIX violations, logic assertion failures, and memory errors. So far, Hydra has discovered 157 new bugs in Linux file systems, including three in verified file systems (FSCQ and Yxv6).
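
How a mutator, an executor, a feedback step, and a pluggable checker fit together can be shown with a toy fuzz loop. This is a sketch under heavy simplification and is not Hydra's code or API: the "file system" is an in-memory dictionary with a deliberately seeded bug, and all function names are hypothetical.

```python
import random


def mutate(ops, rng):
    """Input mutator: append one random operation to an operation sequence."""
    op = rng.choice(["create", "write", "delete"])
    return ops + [(op, rng.choice("ab"))]


def execute(ops):
    """Test executor: run operations against a toy in-memory file system."""
    files = {}
    for op, name in ops:
        if op == "create":
            files.setdefault(name, 0)
        elif op == "write":
            # Seeded bug: writing to a missing file silently creates it.
            files[name] = files.get(name, 0) + 1
        elif op == "delete":
            files.pop(name, None)
    return files


def reference(ops):
    """Checker model: the expected state, where writes need an existing file."""
    files = {}
    for op, name in ops:
        if op == "create":
            files.setdefault(name, 0)
        elif op == "write" and name in files:
            files[name] += 1
        elif op == "delete":
            files.pop(name, None)
    return files


def fuzz(seed=0, rounds=200):
    """Return an operation sequence that exposes a mismatch, or None."""
    rng = random.Random(seed)
    corpus, seen_states = [[]], set()
    for _ in range(rounds):
        ops = mutate(rng.choice(corpus), rng)
        state = execute(ops)
        if state != reference(ops):
            return ops  # checker flagged a semantic violation
        key = tuple(sorted(state.items()))
        if key not in seen_states:  # feedback: keep inputs reaching new states
            seen_states.add(key)
            corpus.append(ops)
    return None
```

Swapping `reference` for a different oracle changes what kind of bug is hunted while the mutation, execution, and feedback steps stay the same, which is the separation of concerns the abstract describes.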

Citations: 4

The Reliability of Modern File Systems in the face of SSD Errors

ACM Transactions on Storage (TOS)

Pub Date : 2020-03-16 DOI: 10.1145/3375553

Shehbaz Jaffer, Stathis Maneas, Andy A. Hwang, Bianca Schroeder

As solid state drives (SSDs) are increasingly replacing hard disk drives, the reliability of storage systems depends on the failure modes of SSDs and the ability of the file system layered on top to handle these failure modes. While the classical paper on IRON File Systems provides a thorough study of the failure policies of three file systems common at the time, we argue that 13 years later it is time to revisit file system reliability with SSDs and their reliability characteristics in mind, based on modern file systems that incorporate journaling, copy-on-write, and log-structured approaches and are optimized for flash. This article presents a detailed study, spanning ext4, Btrfs, and F2FS, and covering a number of different SSD error modes. We develop our own fault injection framework and explore over 1,000 error cases. Our results indicate that 16% of these cases result in a file system that cannot be mounted or even repaired by its system checker. We also identify the key file system metadata structures that can cause such failures, and, finally, we recommend some design guidelines for file systems that are deployed on top of SSDs.
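
Block-level fault injection of the kind described here can be sketched with a wrapper around a simulated device. The article's framework and its exact error modes are not specified in this abstract, so the `FaultyDevice` class and the two modes shown (a failed read and silent corruption) are assumptions chosen for illustration.

```python
class FaultyDevice:
    """Wraps an in-memory 'disk' and injects errors on chosen blocks."""

    def __init__(self, data, block_size=512):
        self.data = bytearray(data)
        self.block_size = block_size
        self.read_errors = set()  # blocks whose reads fail outright
        self.corrupt = set()      # blocks returned with flipped bits

    def read_block(self, n):
        if n in self.read_errors:
            raise IOError(f"injected read error on block {n}")
        start = n * self.block_size
        block = bytes(self.data[start:start + self.block_size])
        if n in self.corrupt:
            # Silent corruption: the read succeeds but the data is wrong.
            block = bytes(b ^ 0xFF for b in block)
        return block


dev = FaultyDevice(bytes(1024))
dev.read_errors.add(1)
```

A test harness would target blocks holding specific metadata structures, then observe whether the code under test detects the fault, recovers, or fails.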

Citations: 6

Fast Erasure Coding for Data Storage

ACM Transactions on Storage (TOS)

Pub Date : 2020-03-07 DOI: 10.1145/3375554

Tianli Zhou, C. Tian

Various techniques have been proposed in the literature to improve erasure code computation efficiency, including optimizing bitmatrix design and computation schedule, common XOR (exclusive-OR) operation reduction, caching management techniques, and vectorization techniques. These techniques were largely proposed individually, and in this work we seek to use them jointly. To accomplish this, the techniques need to be thoroughly evaluated individually and the relations among them better understood. Building on extensive testing, we develop methods to systematically optimize the computation chain together with the underlying bitmatrix. This leads to a simple design approach that optimizes the bitmatrix by minimizing a weighted computation cost function, and to a straightforward coding procedure: follow a computation schedule produced from the optimized bitmatrix and apply XOR-level vectorization. This procedure performs better than most existing techniques (e.g., those used in the ISA-L and Jerasure libraries), and can sometimes even compete against well-known but less general codes such as EVENODD, RDP, and STAR codes. One particularly important observation is that vectorizing the XOR operations is a better choice than directly vectorizing finite field operations, not only because of the flexibility in choosing the finite field size and the better encoding throughput, but also because of the minimal effort needed to migrate to newer CPUs.
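
As a rough illustration of the coding procedure the abstract describes (derive a schedule of XOR operations from a bitmatrix, then apply each XOR to whole packets at once), here is a minimal Python sketch. It is not the authors' implementation. The function names, the toy bitmatrix, and the use of one wide integer XOR as a stand-in for SIMD-style vectorization are all assumptions made for this example, and the naive schedule does no common-XOR reduction or cost weighting.

```python
from typing import List, Tuple


def xor_packets(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length packets in a single wide operation.

    A big-integer XOR stands in here for the wide-register XOR that a
    vectorized implementation would use (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    n = len(a)
    x = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return x.to_bytes(n, "little")


def schedule_from_bitmatrix(bitmatrix: List[List[int]]) -> List[Tuple[int, int]]:
    """Naive schedule: one (parity_row, data_column) XOR per 1-bit.

    The schedule length equals the number of ones in the bitmatrix,
    which is the simplest (unweighted) proxy for computation cost.
    """
    return [
        (row_idx, col_idx)
        for row_idx, row in enumerate(bitmatrix)
        for col_idx, bit in enumerate(row)
        if bit
    ]


def encode(data_packets: List[bytes], bitmatrix: List[List[int]]) -> List[bytes]:
    """Compute parity packets by following the XOR schedule."""
    size = len(data_packets[0])
    parity = [bytes(size) for _ in bitmatrix]  # all-zero parity packets
    for row_idx, col_idx in schedule_from_bitmatrix(bitmatrix):
        parity[row_idx] = xor_packets(parity[row_idx], data_packets[col_idx])
    return parity


if __name__ == "__main__":
    # Hypothetical toy layout: 3 data packets, 2 parity packets.
    data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
    bitmatrix = [[1, 1, 0],
                 [0, 1, 1]]
    parity = encode(data, bitmatrix)
    print([p.hex() for p in parity])  # ['1122', '1f2f']
    # Recover a lost data[0] from parity[0] and the surviving data[1].
    print(xor_packets(parity[0], data[1]) == data[0])  # True
```

Because every operation in the schedule is a plain XOR over whole packets, the same schedule works unchanged for any packet width, which is one way to read the abstract's point about flexibility across finite field sizes and CPU generations.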

Citations: 8