Optimizing crash dump in virtualized environments

Yijian Huang, Haibo Chen, B. Zang
{"title":"Optimizing crash dump in virtualized environments","authors":"Yijian Huang, Haibo Chen, B. Zang","doi":"10.1145/1735997.1736003","DOIUrl":null,"url":null,"abstract":"Crash dump, or core dump is the typical way to save memory image on system crash for future offline debugging and analysis. However, for typical server machines with likely abundant memory, the time of core dump can significantly increase the mean time to repair (MTTR) by delaying the reboot-based recovery, while not dumping the failure context for analysis would risk recurring crashes on the same problems.\n In this paper, we propose several optimization techniques for core dump in virtualized environments, in order to shorten the MTTR of consolidated virtual machines during crashes. First, we parallelize the process of crash dump and the process of rebooting the crashed VM, by dynamically reclaiming and allocating memory between the crashed VM and the newly spawned VM. Second, we use the virtual machine management layer to introspect the critical data structures of the crashed VM to filter out the dump of unused memory. Finally, we implement disk I/O rate control between core dump and the newly spawned VM according to user-tuned rate control policy to balance the time of crash dump and quality of services in the recovery VM.\n We have implemented a working prototype, Vicover, that optimizes core dump on system crash of a virtual machine in Xen, to minimize the MTTR of core dump and recovery as a whole. In our experiment on a virtualized TPC-W server, Vicover shortens the downtime caused by crash dump by around 5X.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Virtual Execution Environments","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1735997.1736003","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Crash dump, or core dump is the typical way to save memory image on system crash for future offline debugging and analysis. However, for typical server machines with likely abundant memory, the time of core dump can significantly increase the mean time to repair (MTTR) by delaying the reboot-based recovery, while not dumping the failure context for analysis would risk recurring crashes on the same problems. In this paper, we propose several optimization techniques for core dump in virtualized environments, in order to shorten the MTTR of consolidated virtual machines during crashes. First, we parallelize the process of crash dump and the process of rebooting the crashed VM, by dynamically reclaiming and allocating memory between the crashed VM and the newly spawned VM. Second, we use the virtual machine management layer to introspect the critical data structures of the crashed VM to filter out the dump of unused memory. Finally, we implement disk I/O rate control between core dump and the newly spawned VM according to user-tuned rate control policy to balance the time of crash dump and quality of services in the recovery VM. We have implemented a working prototype, Vicover, that optimizes core dump on system crash of a virtual machine in Xen, to minimize the MTTR of core dump and recovery as a whole. In our experiment on a virtualized TPC-W server, Vicover shortens the downtime caused by crash dump by around 5X.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
优化虚拟环境中的崩溃转储
崩溃转储或核心转储是在系统崩溃时保存内存映像以供将来脱机调试和分析的典型方法。但是,对于可能具有丰富内存的典型服务器机器,核心转储的时间会延迟基于重新启动的恢复,从而显著增加平均修复时间(MTTR),而不转储故障上下文进行分析可能会导致同一问题上的重复崩溃。在本文中,为了缩短合并虚拟机在崩溃期间的MTTR,我们提出了几种虚拟化环境中核心转储的优化技术。首先,我们通过在崩溃的VM和新生成的VM之间动态回收和分配内存,并行处理崩溃转储和重新启动崩溃VM的过程。其次,我们使用虚拟机管理层来内省崩溃虚拟机的关键数据结构,以过滤掉未使用的内存转储。最后,我们根据用户调整的速率控制策略,在核心转储和新生成的虚拟机之间实现磁盘I/O速率控制,以平衡崩溃转储的时间和恢复虚拟机的服务质量。我们已经实现了一个工作原型,Vicover,它在Xen中优化了虚拟机系统崩溃时的核心转储,以最小化核心转储和恢复的整体MTTR。在我们对虚拟TPC-W服务器的实验中,Vicover将崩溃转储导致的停机时间缩短了大约5倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Shrinking the hypervisor one subsystem at a time: a userspace packet switch for virtual machines A fast abstract syntax tree interpreter for R DBILL: an efficient and retargetable dynamic binary instrumentation framework using llvm backend Ginseng: market-driven memory allocation Tesseract: reconciling guest I/O and hypervisor swapping in a VM
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1