Recovery scheme for hardening system on programmable chips

C. Meinhardt, R. Reis, M. Violante, M. Reorda
Published in: 2009 10th Latin American Test Workshop, 2009-03-02
DOI: 10.1109/LATW.2009.4813816 (https://doi.org/10.1109/LATW.2009.4813816)
Citations: 1

Abstract

Checkpoint and rollback recovery techniques enable a system to survive failures by periodically saving a known-good snapshot of the system's state and rolling back to it when a failure is detected. The approach is particularly interesting for developing critical systems on programmable chips, which today offer multiple embedded processor cores as well as configurable fabric that can be used to implement error detection and correction mechanisms. This paper presents an approach for developing safety- or mission-critical systems on programmable chips that tolerate soft errors by exploiting processor duplication for error detection, and checkpoint and rollback recovery for cost-efficient error correction. We developed a prototype implementation of the proposed approach targeting the Leon processor core, and we collected preliminary results that outline the capability of the technique to tolerate soft errors affecting the processor's internal registers. This paper is the first step toward the definition of an automatic design flow for hardening processor cores (either hard or soft) embedded in programmable chips such as SRAM-based FPGAs.
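
The combination of duplication for detection and checkpoint/rollback for correction can be illustrated with a small software model. The sketch below is not the authors' Leon-based implementation; it is a toy simulation under stated assumptions: `ToyCore`, `run_hardened`, the register layout, and the per-step comparison are all hypothetical names and choices introduced for illustration, and the stored checkpoint is assumed to be immune to faults.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Hypothetical stand-in for a processor core: four registers and a program counter."""
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])
    pc: int = 0

    def step(self):
        # Deterministic toy "instruction" so that two fault-free cores stay identical.
        self.regs[0] = (self.regs[0] + self.pc + 1) & 0xFFFFFFFF
        self.regs[1] = (self.regs[1] ^ self.regs[0]) & 0xFFFFFFFF
        self.pc += 1

    def state(self):
        return (tuple(self.regs), self.pc)


def run_hardened(steps, checkpoint_interval, faults=None):
    """Run two duplicated cores in lockstep with checkpoint and rollback.

    faults maps a step number to (register index, bit index); the bit is
    flipped once in core A after that step, modelling a transient soft error.
    Returns (final core A, number of rollbacks performed).
    """
    pending = dict(faults or {})
    core_a, core_b = ToyCore(), ToyCore()
    checkpoint = copy.deepcopy(core_a)  # assumed fault-free storage
    rollbacks = 0

    while core_a.pc < steps:
        core_a.step()
        core_b.step()

        # Inject a single bit flip into core A only; pop() makes it transient.
        fault = pending.pop(core_a.pc, None)
        if fault is not None:
            reg, bit = fault
            core_a.regs[reg] ^= 1 << bit

        if core_a.state() != core_b.state():
            # Detection: the duplicated cores disagree. Duplication cannot tell
            # which copy is wrong, so both are restored from the checkpoint.
            core_a = copy.deepcopy(checkpoint)
            core_b = copy.deepcopy(checkpoint)
            rollbacks += 1
        elif core_a.pc % checkpoint_interval == 0:
            # The cores agree, so the current state is taken as known good.
            checkpoint = copy.deepcopy(core_a)

    return core_a, rollbacks


if __name__ == "__main__":
    golden, _ = run_hardened(steps=20, checkpoint_interval=5)
    hardened, n = run_hardened(steps=20, checkpoint_interval=5, faults={7: (0, 3)})
    print("rollbacks:", n)
    print("matches fault-free run:", hardened.state() == golden.state())
```

In this model the checkpoint interval trades storage traffic against the amount of work repeated after a rollback; a real design would also have to cover memory contents and faults in the checkpoint storage itself, which the sketch deliberately leaves out.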