CrashMonkey and ACE

Jayashree Mohan, Ashlie Martinez, Soujanya Ponnapalli, P. Raju, Vijay Chidambaram
{"title":"CrashMonkey and ACE","authors":"Jayashree Mohan, Ashlie Martinez, Soujanya Ponnapalli, P. Raju, Vijay Chidambaram","doi":"10.1145/3320275","DOIUrl":null,"url":null,"abstract":"We present CrashMonkey and Ace, a set of tools to systematically find crash-consistency bugs in Linux file systems. CrashMonkey is a record-and-replay framework which tests a given workload on the target file system by simulating power-loss crashes while the workload is being executed, and checking if the file system recovers to a correct state after each crash. Ace automatically generates all the workloads to be run on the target file system. We build CrashMonkey and Ace based on a new approach to test file-system crash consistency: bounded black-box crash testing (B3). B3 tests the file system in a black-box manner using workloads of file-system operations. Since the space of possible workloads is infinite, B3 bounds this space based on parameters such as the number of file-system operations or which operations to include, and exhaustively generates workloads within this bounded space. B3 builds upon insights derived from our study of crash-consistency bugs reported in Linux file systems in the last 5 years. We observed that most reported bugs can be reproduced using small workloads of three or fewer file-system operations on a newly created file system, and that all reported bugs result from crashes after fsync()-related system calls. CrashMonkey and Ace are able to find 24 out of the 26 crash-consistency bugs reported in the last 5 years. Our tools also revealed 10 new crash-consistency bugs in widely used, mature Linux file systems, 7 of which existed in the kernel since 2014. Additionally, our tools found a crash-consistency bug in a verified file system, FSCQ. The new bugs result in severe consequences like broken rename atomicity, loss of persisted files and directories, and data loss.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage (TOS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3320275","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

We present CrashMonkey and Ace, a set of tools to systematically find crash-consistency bugs in Linux file systems. CrashMonkey is a record-and-replay framework which tests a given workload on the target file system by simulating power-loss crashes while the workload is being executed, and checking if the file system recovers to a correct state after each crash. Ace automatically generates all the workloads to be run on the target file system. We build CrashMonkey and Ace based on a new approach to test file-system crash consistency: bounded black-box crash testing (B3). B3 tests the file system in a black-box manner using workloads of file-system operations. Since the space of possible workloads is infinite, B3 bounds this space based on parameters such as the number of file-system operations or which operations to include, and exhaustively generates workloads within this bounded space. B3 builds upon insights derived from our study of crash-consistency bugs reported in Linux file systems in the last 5 years. We observed that most reported bugs can be reproduced using small workloads of three or fewer file-system operations on a newly created file system, and that all reported bugs result from crashes after fsync()-related system calls. CrashMonkey and Ace are able to find 24 out of the 26 crash-consistency bugs reported in the last 5 years. Our tools also revealed 10 new crash-consistency bugs in widely used, mature Linux file systems, 7 of which existed in the kernel since 2014. Additionally, our tools found a crash-consistency bug in a verified file system, FSCQ. The new bugs result in severe consequences like broken rename atomicity, loss of persisted files and directories, and data loss.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
CrashMonkey和ACE
我们介绍了CrashMonkey和Ace,这是一套系统地发现Linux文件系统中崩溃一致性错误的工具。CrashMonkey是一个记录和重放框架,它通过模拟工作负载正在执行时的断电崩溃来测试目标文件系统上给定的工作负载,并检查每次崩溃后文件系统是否恢复到正确的状态。Ace自动生成要在目标文件系统上运行的所有工作负载。我们基于一种测试文件系统崩溃一致性的新方法构建了CrashMonkey和Ace:有界黑盒崩溃测试(B3)。B3使用文件系统操作的工作负载以黑盒方式测试文件系统。由于可能的工作负载空间是无限的,因此B3根据文件系统操作的数量或要包含的操作等参数对该空间进行限定,并在这个限定的空间内详尽地生成工作负载。B3基于我们在过去5年中对Linux文件系统中报告的崩溃一致性错误的研究得出的见解。我们观察到,大多数报告的bug都可以在新创建的文件系统上使用三个或更少的文件系统操作的小工作负载来重现,并且所有报告的bug都是由与fsync()相关的系统调用之后的崩溃造成的。CrashMonkey和Ace能够发现过去5年报告的26个崩溃一致性漏洞中的24个。我们的工具还在广泛使用的成熟Linux文件系统中发现了10个新的崩溃一致性错误,其中7个自2014年以来就存在于内核中。此外,我们的工具在经过验证的文件系统FSCQ中发现了一个崩溃一致性错误。这些新bug会导致严重的后果,比如重命名原子性被破坏、持久化文件和目录丢失以及数据丢失。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
WebAssembly-based Delta Sync for Cloud Storage Services DEFUSE: An Interface for Fast and Correct User Space File System Access Donag: Generating Efficient Patches and Diffs for Compressed Archives Building GC-free Key-value Store on HM-SMR Drives with ZoneFS Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1