A fault-tolerant system-on-programmable-chip based on domain-partition and blind reconfiguration

L. Shang, Mi Zhou, Yu Hu
Published in: 2010 NASA/ESA Conference on Adaptive Hardware and Systems
Publication date: 2010-06-15
DOI: 10.1109/AHS.2010.5546245 (https://doi.org/10.1109/AHS.2010.5546245)
Citations: 2

Abstract

Field programmable gate arrays (FPGAs) are widely used to build Systems-on-Programmable-Chips (SOPCs) because they offer abundant reconfigurable heterogeneous resources for implementing various intellectual property cores. However, as device feature sizes shrink and die areas grow, FPGAs are increasingly susceptible to errors induced by electromigration and radiation, which makes building reliable SOPCs challenging. In this paper, an SOPC implementing a smart 1553B bus node is presented to investigate these challenges and to illustrate a feasible approach to building a complex system with high reliability and low recovery latency on a commercial FPGA. First, a general reliability model, the Domain-Partition (DP) model, is introduced to formulate SOPCs that contain multiple alternative configurations providing fault recovery capability. The assignment of the alternative configurations that maximizes reliability is then determined from a first-order optimal solution under the DP framework. Finally, the blind reconfiguration technique is used to reduce the recovery latency. Monte Carlo simulation experiments are carried out to evaluate the reliability and the latency. The results show that higher reliability is attainable with less overhead than with the generic triple-modular redundancy method.
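
As a rough illustration of how a Monte Carlo reliability evaluation of this kind can be set up, the sketch below estimates the mission reliability of the generic triple-modular redundancy (TMR) baseline and compares it with the closed-form value. It is not the paper's DP model or its experimental setup. The assumptions are ours: independent replicas with exponentially distributed lifetimes, a perfect voter, no repair, and hypothetical values for the failure rate and mission time. The function names `simulate_tmr` and `analytic_tmr` are likewise illustrative.

```python
import math
import random


def simulate_tmr(failure_rate, mission_time, trials, seed=0):
    """Monte Carlo estimate of the probability that a TMR system survives the mission.

    Assumptions (not from the paper): independent exponential lifetimes,
    a perfect voter, and no repair or reconfiguration.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # Draw an independent lifetime for each of the three replicas.
        lifetimes = [rng.expovariate(failure_rate) for _ in range(3)]
        # The voter masks one failure, so at least two replicas must outlive the mission.
        alive = sum(1 for t in lifetimes if t > mission_time)
        if alive >= 2:
            survived += 1
    return survived / trials


def analytic_tmr(failure_rate, mission_time):
    """Closed-form TMR reliability, R = 3r^2 - 2r^3, with r = exp(-rate * time)."""
    r = math.exp(-failure_rate * mission_time)
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    rate, mission = 0.1, 5.0  # hypothetical failure rate and mission time
    print("simulated:", simulate_tmr(rate, mission, trials=100_000, seed=1))
    print("analytic: ", analytic_tmr(rate, mission))
```

Checking the simulated estimate against the closed form is a useful sanity test before replacing the survival rule with a more elaborate one, such as a system that falls back to alternative configurations after a fault.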