Recovery Mechanisms for Dual Core Architectures

C. E. Salloum, A. Steininger, Peter Tummeltshammer, Werner Harter
Published in: 2006 21st IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems
Publication date: 2006-10-04
DOI: 10.1109/DFT.2006.52 (https://doi.org/10.1109/DFT.2006.52)
Citations: 6

Abstract

Dual core architectures are commonly used to establish fault tolerance on the node level. Since comparison is usually performed for the outputs only, no precise diagnostic information is available, and error handling comes down to a reset of both cores. The strategy proposed in this paper allows more fine-grained error handling. It is based on the following steps: (1) Identification of those registers that are actually relevant for recovering the last known correct core state. (2) Protection of these registers by additional comparators. (3) Use of the trap mechanism for recovering a consistent state of the complete core. (4) (Optional) Provision of rollback capability for the relevant registers in order to relax the critical path constraints. The paper discusses and motivates these individual steps and puts them into context. In many cases the speed-up gained for the recovery is sufficient for using a dual core as a fail-operational instead of a fail-silent component with respect to transient faults. Rather than being restricted to a specific processor design, our mechanisms can be employed in a wide variety of dual-core architectures.
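
The four steps in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' hardware design: the names `Core`, `RELEVANT`, `run_lockstep` and `fault_step` are hypothetical, the choice of `pc` and `acc` as the "relevant" registers is made up for illustration, and a software checkpoint stands in for the optional rollback capability of step (4). Two cores execute the same program in lockstep, only the relevant registers are compared after each instruction, and a mismatch triggers a trap-like handler that restores the last agreed state in both cores instead of resetting them.

```python
from dataclasses import dataclass

# Assumption: these are the registers "relevant for recovery" (step 1).
RELEVANT = ("pc", "acc")


@dataclass
class Core:
    pc: int = 0
    acc: int = 0
    scratch: int = 0  # unprotected register; rebuilt by re-execution

    def step(self, program):
        """Execute one instruction: add the operand to acc, advance pc."""
        self.scratch = program[self.pc]
        self.acc += self.scratch
        self.pc += 1

    def relevant_state(self):
        """Values of the protected registers only."""
        return tuple(getattr(self, name) for name in RELEVANT)

    def restore(self, state):
        """Write a saved relevant state back into the registers."""
        for name, value in zip(RELEVANT, state):
            setattr(self, name, value)


def run_lockstep(program, fault_step=None):
    """Run two cores in lockstep; return (result, number_of_recoveries)."""
    a, b = Core(), Core()
    checkpoint = a.relevant_state()  # last state on which both cores agreed
    recoveries = 0
    step = 0
    while a.pc < len(program):
        a.step(program)
        b.step(program)
        if step == fault_step:
            a.acc ^= 0x4  # transient fault: flip one bit in core A only
        step += 1
        # Step 2: compare the relevant registers, not just the outputs.
        if a.relevant_state() != b.relevant_state():
            # Steps 3 and 4: trap-like recovery that rolls both cores back
            # to the checkpoint and re-executes from there.
            a.restore(checkpoint)
            b.restore(checkpoint)
            recoveries += 1
        else:
            checkpoint = a.relevant_state()
    return a.acc, recoveries


if __name__ == "__main__":
    print(run_lockstep([1, 2, 3, 4]))                # fault-free run
    print(run_lockstep([1, 2, 3, 4], fault_step=2))  # one injected fault
```

In this model the faulty run still produces the same result as the fault-free one, at the cost of re-executing a single instruction. That mirrors, in a very simplified way, the abstract's point that fast recovery can let a dual core keep operating through a transient fault rather than falling silent.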