Architectural vulnerability aware checkpoint placement in a multicore processor

A. Lotfi, Arash Bayat, S. Safari
Published in: 2012 IEEE 18th International On-Line Testing Symposium (IOLTS)
Publication date: 2012-06-27
DOI: 10.1109/IOLTS.2012.6313852 (https://doi.org/10.1109/IOLTS.2012.6313852)
Citations: 1

Abstract

As system complexity increases, the probability of failure rises substantially, so systems need techniques that support fault tolerance. Checkpointing is widely used to reduce the execution time of long-running programs in the presence of failures and to enhance the reliability of such systems. Several methods have been studied to determine the checkpointing interval that optimizes system performance. The crucial parameter in all of these solutions is the system failure model, which is usually assumed to follow an exponential or Weibull distribution. These models are not fully accurate, however, because they fail to capture the effect of soft errors. In this paper, we introduce a more realistic failure model based on the processor's architectural vulnerability factor (AVF). In addition, we propose three checkpoint placement methods, with constant and variable intervals, that determine suitable checkpoint locations for the proposed failure model. Our experimental results show that our method, which can be implemented on any multicore system, finds suitable points at which checkpoints should be taken.
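For orientation, the constant-interval baseline that the abstract contrasts against can be sketched with Young's classic first-order approximation, which assumes exponentially distributed failures. The second function is purely illustrative and is not the paper's method: it assumes the raw fault rate can be derated by an AVF value (the fraction of faults that become visible errors) before the same formula is applied. The names `young_interval` and `avf_scaled_interval` and all parameters are hypothetical.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal constant
    checkpoint interval under an exponential failure model:
    T ~= sqrt(2 * C * MTBF), with C and MTBF in the same time unit."""
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def avf_scaled_interval(checkpoint_cost: float,
                        raw_fault_rate: float,
                        avf: float) -> float:
    """Illustrative assumption (not taken from the paper): derate the raw
    fault rate by the AVF, then reuse Young's formula with the resulting
    effective mean time between failures."""
    if not 0.0 < avf <= 1.0:
        raise ValueError("avf must be in (0, 1]")
    if raw_fault_rate <= 0:
        raise ValueError("raw_fault_rate must be positive")
    effective_rate = raw_fault_rate * avf  # only vulnerable faults count
    return young_interval(checkpoint_cost, 1.0 / effective_rate)


if __name__ == "__main__":
    # Checkpoint cost of 2 time units, MTBF of 100 time units.
    print(young_interval(2.0, 100.0))
    # Same checkpoint cost, raw fault rate 0.01 per time unit, AVF 0.25.
    print(avf_scaled_interval(2.0, 0.01, 0.25))
```

Under this assumption, a lower AVF lengthens the interval, which matches the intuition that program phases less vulnerable to soft errors need fewer checkpoints.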