ReHype:允许虚拟机在虚拟机管理程序故障时存活

Michael V. Le, Y. Tamir
{"title":"ReHype:允许虚拟机在虚拟机管理程序故障时存活","authors":"Michael V. Le, Y. Tamir","doi":"10.1145/1952682.1952692","DOIUrl":null,"url":null,"abstract":"With existing virtualized systems, hypervisor failures lead to overall system failure and the loss of all the work in progress of virtual machines (VMs) running on the system. We introduce ReHype, a mechanism for recovery from hypervisor failures by booting a new instance of the hypervisor while preserving the state of running VMs. VMs are stalled during the hypervisor reboot and resume normal execution once the new hypervisor instance is running. Hypervisor failures can lead to arbitrary state corruption and inconsistencies throughout the system. ReHype deals with the challenge of protecting the recovered hypervisor instance from such corrupted state and resolving inconsistencies between different parts of hypervisor state as well as between the hypervisor and VMs and between the hypervisor and the hardware. We have implemented ReHype for the Xen hypervisor. The implementation was done incrementally, using results from fault injection experiments to identify the sources of dangerous state corruption and inconsistencies. The implementation of ReHype involved only 880 LOC added or modified in Xen. The memory space overhead of ReHype is only 2.1MB for a pristine copy of the hypervisor code and static data plus a small reserved memory area. The fault injection campaigns used to evaluate the effectiveness of ReHype involved a system with multiple VMs running I/O and hypercall-intensive benchmarks. Our experimental results show that the ReHype prototype can successfully recover from over 90% of detected hypervisor failures.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"36","resultStr":"{\"title\":\"ReHype: enabling VM survival across hypervisor failures\",\"authors\":\"Michael V. Le, Y. Tamir\",\"doi\":\"10.1145/1952682.1952692\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With existing virtualized systems, hypervisor failures lead to overall system failure and the loss of all the work in progress of virtual machines (VMs) running on the system. We introduce ReHype, a mechanism for recovery from hypervisor failures by booting a new instance of the hypervisor while preserving the state of running VMs. VMs are stalled during the hypervisor reboot and resume normal execution once the new hypervisor instance is running. Hypervisor failures can lead to arbitrary state corruption and inconsistencies throughout the system. ReHype deals with the challenge of protecting the recovered hypervisor instance from such corrupted state and resolving inconsistencies between different parts of hypervisor state as well as between the hypervisor and VMs and between the hypervisor and the hardware. We have implemented ReHype for the Xen hypervisor. The implementation was done incrementally, using results from fault injection experiments to identify the sources of dangerous state corruption and inconsistencies. The implementation of ReHype involved only 880 LOC added or modified in Xen. The memory space overhead of ReHype is only 2.1MB for a pristine copy of the hypervisor code and static data plus a small reserved memory area. The fault injection campaigns used to evaluate the effectiveness of ReHype involved a system with multiple VMs running I/O and hypercall-intensive benchmarks. Our experimental results show that the ReHype prototype can successfully recover from over 90% of detected hypervisor failures.\",\"PeriodicalId\":202844,\"journal\":{\"name\":\"International Conference on Virtual Execution Environments\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-03-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"36\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Virtual Execution Environments\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1952682.1952692\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Virtual Execution Environments","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1952682.1952692","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 36

摘要

对于现有的虚拟化系统,管理程序故障会导致整个系统故障,并且丢失系统上运行的虚拟机正在进行的所有工作。我们引入ReHype,这是一种通过在保持运行中的虚拟机状态的同时启动虚拟机管理程序的新实例来从虚拟机管理程序故障中恢复的机制。虚拟机在hypervisor重启期间会停止运行,在新的hypervisor实例运行后恢复正常运行。管理程序故障可能导致整个系统的任意状态损坏和不一致。ReHype处理的挑战是保护已恢复的管理程序实例免受这种损坏状态的影响,并解决管理程序状态的不同部分之间、管理程序与vm之间以及管理程序与硬件之间的不一致。我们已经为Xen管理程序实现了ReHype。该实现是逐步完成的,使用故障注入实验的结果来识别危险状态损坏和不一致的来源。ReHype的实现只涉及在Xen中添加或修改的880个LOC。ReHype的内存空间开销仅为2.1MB,用于管理程序代码和静态数据的原始副本以及一个小的保留内存区域。用于评估ReHype有效性的故障注入活动涉及一个具有多个运行I/O和超调用密集型基准的vm的系统。我们的实验结果表明,ReHype原型可以成功地从90%以上检测到的hypervisor故障中恢复。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
ReHype: enabling VM survival across hypervisor failures
With existing virtualized systems, hypervisor failures lead to overall system failure and the loss of all the work in progress of virtual machines (VMs) running on the system. We introduce ReHype, a mechanism for recovery from hypervisor failures by booting a new instance of the hypervisor while preserving the state of running VMs. VMs are stalled during the hypervisor reboot and resume normal execution once the new hypervisor instance is running. Hypervisor failures can lead to arbitrary state corruption and inconsistencies throughout the system. ReHype deals with the challenge of protecting the recovered hypervisor instance from such corrupted state and resolving inconsistencies between different parts of hypervisor state as well as between the hypervisor and VMs and between the hypervisor and the hardware. We have implemented ReHype for the Xen hypervisor. The implementation was done incrementally, using results from fault injection experiments to identify the sources of dangerous state corruption and inconsistencies. The implementation of ReHype involved only 880 LOC added or modified in Xen. The memory space overhead of ReHype is only 2.1MB for a pristine copy of the hypervisor code and static data plus a small reserved memory area. The fault injection campaigns used to evaluate the effectiveness of ReHype involved a system with multiple VMs running I/O and hypercall-intensive benchmarks. Our experimental results show that the ReHype prototype can successfully recover from over 90% of detected hypervisor failures.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Shrinking the hypervisor one subsystem at a time: a userspace packet switch for virtual machines A fast abstract syntax tree interpreter for R DBILL: an efficient and retargetable dynamic binary instrumentation framework using llvm backend Ginseng: market-driven memory allocation Tesseract: reconciling guest I/O and hypervisor swapping in a VM
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1