Replay without Recording of Production Bugs for Service Oriented Applications

Nipun Arora, Jonathan Bell, Franjo Ivancic, G. Kaiser, Baishakhi Ray
{"title":"Replay without Recording of Production Bugs for Service Oriented Applications","authors":"Nipun Arora, Jonathan Bell, Franjo Ivancic, G. Kaiser, Baishakhi Ray","doi":"10.1145/3238147.3238186","DOIUrl":null,"url":null,"abstract":"Short time-to-localize and time-to-fix for production bugs is extremely important for any 24x7 service-oriented application (SOA). Debugging buggy behavior in deployed applications is hard, as it requires careful reproduction of a similar environment and workload. Prior approaches for automatically reproducing production failures do not scale to large SOA systems. Our key insight is that for many failures in SOA systems (e.g., many semantic and performance bugs), a failure can automatically be reproduced solely by relaying network packets to replicas of suspect services, an insight that we validated through a manual study of 16 real bugs across five different systems. This paper presents Parikshan, an application monitoring framework that leverages user-space virtualization and network proxy technologies to provide a sandbox “debug” environment. In this “debug” environment, developers are free to attach debuggers and analysis tools without impacting performance or correctness of the production environment. In comparison to existing monitoring solutions that can slow down production applications, Parikshan allows application monitoring at significantly lower overhead.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"39 1","pages":"452-463"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3238147.3238186","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

Short time-to-localize and time-to-fix for production bugs is extremely important for any 24x7 service-oriented application (SOA). Debugging buggy behavior in deployed applications is hard, as it requires careful reproduction of a similar environment and workload. Prior approaches for automatically reproducing production failures do not scale to large SOA systems. Our key insight is that for many failures in SOA systems (e.g., many semantic and performance bugs), a failure can automatically be reproduced solely by relaying network packets to replicas of suspect services, an insight that we validated through a manual study of 16 real bugs across five different systems. This paper presents Parikshan, an application monitoring framework that leverages user-space virtualization and network proxy technologies to provide a sandbox “debug” environment. In this “debug” environment, developers are free to attach debuggers and analysis tools without impacting performance or correctness of the production environment. In comparison to existing monitoring solutions that can slow down production applications, Parikshan allows application monitoring at significantly lower overhead.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
面向服务的应用程序的不记录生产错误的重播
对于任何24x7面向服务的应用程序(SOA)来说,短时间本地化和短时间修复生产错误是极其重要的。调试已部署应用程序中的错误行为是困难的,因为它需要仔细地再现类似的环境和工作负载。以前用于自动再现生产故障的方法不能扩展到大型SOA系统。我们的主要见解是,对于SOA系统中的许多故障(例如,许多语义和性能错误),仅通过将网络数据包转发到可疑服务的副本就可以自动再现故障,我们通过手动研究五个不同系统中的16个真实错误验证了这一见解。本文介绍了Parikshan,一个利用用户空间虚拟化和网络代理技术提供沙盒“调试”环境的应用程序监控框架。在这个“调试”环境中,开发人员可以自由地附加调试器和分析工具,而不会影响生产环境的性能或正确性。与现有的可能降低生产应用程序速度的监控解决方案相比,Parikshan允许以更低的开销进行应用程序监控。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Automatically Testing Implementations of Numerical Abstract Domains Self-Protection of Android Systems from Inter-component Communication Attacks Characterizing the Natural Language Descriptions in Software Logging Statements DroidMate-2: A Platform for Android Test Generation CPA-SymExec: Efficient Symbolic Execution in CPAchecker
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1