Execution reconstruction: harnessing failure reoccurrences for failure reproduction

Gefei Zuo, Jiacheng Ma, Andrew Quinn, Pramod Bhatotia, Pedro Fonseca, Baris Kasikci
{"title":"Execution reconstruction: harnessing failure reoccurrences for failure reproduction","authors":"Gefei Zuo, Jiacheng Ma, Andrew Quinn, Pramod Bhatotia, Pedro Fonseca, Baris Kasikci","doi":"10.1145/3453483.3454101","DOIUrl":null,"url":null,"abstract":"Reproducing production failures is crucial for software reliability. Alas, existing bug reproduction approaches are not suitable for production systems because they are not simultaneously efficient, effective, and accurate. In this work, we survey prior techniques and show that existing approaches over-prioritize a subset of these properties, and sacrifice the remaining ones. As a result, existing tools do not enable the plethora of proposed failure reproduction use-cases (e.g., debugging, security forensics, fuzzing) for production failures. We propose Execution Reconstruction (ER), a technique that strikes a better balance between efficiency, effectiveness and accuracy for reproducing production failures. ER uses hardware-assisted control and data tracing to shepherd symbolic execution and reproduce failures. ER’s key novelty lies in identifying data values that are both inexpensive to monitor and useful for eliding the scalability limitations of symbolic execution. ER harnesses failure reoccurrences by iteratively performing tracing and symbolic execution, which reduces runtime overhead. Whereas prior production-grade techniques can only reproduce short executions, ER can reproduce any reoccuring failure. Thus, unlike existing tools, ER reproduces fully replayable executions that can power a variety of debugging and reliabilty use cases. ER incurs on average 0.3% (up to 1.1%) runtime monitoring overhead for a broad range of real-world systems, making it practical for real-world deployment.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"38 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453483.3454101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Reproducing production failures is crucial for software reliability. Alas, existing bug reproduction approaches are not suitable for production systems because they are not simultaneously efficient, effective, and accurate. In this work, we survey prior techniques and show that existing approaches over-prioritize a subset of these properties, and sacrifice the remaining ones. As a result, existing tools do not enable the plethora of proposed failure reproduction use-cases (e.g., debugging, security forensics, fuzzing) for production failures. We propose Execution Reconstruction (ER), a technique that strikes a better balance between efficiency, effectiveness and accuracy for reproducing production failures. ER uses hardware-assisted control and data tracing to shepherd symbolic execution and reproduce failures. ER’s key novelty lies in identifying data values that are both inexpensive to monitor and useful for eliding the scalability limitations of symbolic execution. ER harnesses failure reoccurrences by iteratively performing tracing and symbolic execution, which reduces runtime overhead. Whereas prior production-grade techniques can only reproduce short executions, ER can reproduce any reoccuring failure. Thus, unlike existing tools, ER reproduces fully replayable executions that can power a variety of debugging and reliabilty use cases. ER incurs on average 0.3% (up to 1.1%) runtime monitoring overhead for a broad range of real-world systems, making it practical for real-world deployment.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
执行重构:利用故障再现来实现故障再现
再现生产故障对于软件可靠性至关重要。唉,现有的bug复制方法不适合生产系统,因为它们不能同时高效、有效和准确。在这项工作中,我们调查了先前的技术,并表明现有的方法优先考虑了这些属性的子集,而牺牲了其余的属性。因此,现有的工具不能为生产故障启用过多的故障再现用例(例如,调试、安全取证、模糊测试)。我们提出了执行重建(ER),这是一种在再现生产故障的效率、有效性和准确性之间取得更好平衡的技术。ER使用硬件辅助控制和数据跟踪来指导符号执行和再现故障。ER的关键新颖之处在于识别数据值,这些数据值的监控成本低,而且有助于消除符号执行的可伸缩性限制。ER通过迭代地执行跟踪和符号执行来控制故障的再次发生,从而减少了运行时开销。以前的生产级技术只能重现短时间的执行,而ER可以重现任何反复出现的故障。因此,与现有的工具不同,ER再现了完全可重放的执行,可以为各种调试和可靠性用例提供支持。对于广泛的实际系统,ER平均产生0.3%(最高1.1%)的运行时监视开销,这使得它适用于实际部署。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning to find naming issues with big code and small supervision Cyclic program synthesis Fluid: a framework for approximate concurrency via controlled dependency relaxation Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models Phased synthesis of divide and conquer programs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1