A partial task replication algorithm for fault-tolerant FPGA-based soft-multiprocessors

Masoume Zabihi, Hamed Farbeh, S. Miremadi
2015 CSI Symposium on Real-Time and Embedded Systems and Technologies (RTEST), published 2015-10-01
DOI: 10.1109/RTEST.2015.7369842 (https://doi.org/10.1109/RTEST.2015.7369842)
Citation count: 1

Abstract

FPGA-based multiprocessors, referred to as soft-multiprocessors, are increasingly used in embedded systems due to appealing SRAM features. More than 95% of such FPGAs are occupied by the SRAM cells that hold the configuration bits. These SRAM cells are highly vulnerable to soft errors, which threaten the reliability of the system. This paper proposes a fault-tolerant method to detect and correct errors in the configuration bits. The main idea of this method is to analyze the scheduled task graph and select a subset of tasks to be replicated on multiple processors, based on the utilization of the processors in different execution phases. To this end, 1) errors are detected by re-executing a subset of tasks on multiple processors and comparing their outputs; 2) errors are corrected by re-downloading the fault-free bitstream; 3) the system recovers from errors by rolling back to correct checkpoints. The proposed method is evaluated on an FPGA containing four and eight processors running randomly generated task graphs. The simulation results show that the performance overhead of the proposed method for four and eight processors is 20% and 15%, respectively. For the lockstep method, these values are about 90% and 45%, respectively. Moreover, the area overhead of the proposed method is zero.
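The abstract does not give the selection algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: a task is replicated only if some other processor is idle during that task's own time window, so the replica adds no delay. The names Slot, is_idle, select_replicas and check, the example schedule, and the same-window placement rule are all hypothetical illustrations and are not taken from the paper, whose method reports a nonzero performance overhead and therefore must differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One scheduled execution of a task on a processor (hypothetical type)."""
    task: str
    proc: int
    start: int
    end: int  # exclusive


def is_idle(schedule, proc, start, end):
    """True if `proc` runs nothing that overlaps the window [start, end)."""
    return all(s.proc != proc or s.end <= start or s.start >= end
               for s in schedule)


def select_replicas(schedule, num_procs):
    """Greedily pick tasks that can be replicated at no time cost.

    Assumption: a replica must run in the same window as its primary, on a
    different processor that is idle for the whole window. Tasks with no
    idle peer are left unreplicated, which is what makes the scheme partial.
    """
    placed = list(schedule)  # primaries plus replicas accepted so far
    replicas = {}
    for slot in sorted(schedule, key=lambda s: s.start):
        for proc in range(num_procs):
            if proc != slot.proc and is_idle(placed, proc, slot.start, slot.end):
                replica = Slot(slot.task, proc, slot.start, slot.end)
                placed.append(replica)
                replicas[slot.task] = replica
                break
    return replicas


def check(primary_output, replica_output):
    """Detection step: a mismatch between the two copies signals an error.

    The returned string only names the follow-up actions described in the
    abstract (re-download the bitstream, roll back to a checkpoint).
    """
    if primary_output == replica_output:
        return "ok"
    return "reconfigure_and_rollback"


# Hypothetical three-processor schedule; processors 1 and 2 have idle phases.
schedule = [
    Slot("A", 0, 0, 4),
    Slot("B", 1, 0, 2),
    Slot("C", 1, 2, 4),
    Slot("D", 0, 4, 6),
    Slot("E", 2, 0, 2),
]
replicas = select_replicas(schedule, num_procs=3)
for name in sorted(replicas):
    r = replicas[name]
    print(f"task {name}: replica on processor {r.proc} during [{r.start}, {r.end})")
```

In this example only tasks C and D find an idle peer processor, so two of the five tasks are covered by comparison. A fuller treatment would also have to respect task-graph dependencies and decide how much schedule stretching to accept in exchange for higher coverage.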