Zero-performance-overhead online fault detection and diagnosis in 3D stacked integrated circuits

S. Safiruddin, M. Lefter, D. Borodin, G. Voicu, S. Cotofana
{"title":"Zero-performance-overhead online fault detection and diagnosis in 3D stacked integrated circuits","authors":"S. Safiruddin, M. Lefter, D. Borodin, G. Voicu, S. Cotofana","doi":"10.1145/2765491.2765514","DOIUrl":null,"url":null,"abstract":"In this paper we present a zero-performance-overhead online fault detection and diagnosis scheme that exploits the vertical proximity of hardware inherent in 3D stacked integrated circuits (3D-SIC). We consider a 3D stacked processor executing independent instruction streams from different threads, on each die. We propose the vertical clustering of functionally identical computational blocks in order to enable the utilization of the 3D specific low-latency interlayer communication infrastructure. The clustering facilitates the parallel re-execution of instructions on idle units located in the proximity of the units which initially computed them and in this way creates the means for fault diagnosis and detection. We detail the control, interconnection communication infrastructure, instruction distribution, and results processing policies required for our scheme. To determine the effectiveness of the approach, we evaluate its performance in terms of diagnosis latency and percentage of verified operations on 3 to 8 core processors implemented on 3 to 8 tier 3D-SICs, respectively, by means of simulations. Our experiments indicate that the diagnosis latency ranges from 9 to 5 cycles, for 3 to 8 cores, respectively. For transient fault detection our simulations indicate that 86% to 94% of all executed instructions are verified, for 3 to 8 cores, respectively. When only one of the layers is protected against transient faults the number of verified operations increases to 94% to 99%, for the same simulation conditions. This suggests that, if certain conditions are fulfilled at design time, our approach can completely protect one instruction stream identified as being critical for the application. Our simulations clearly indicate that the proposed scheme has the potential to improve the 3D stacked integrated circuits dependability with no performance overhead and at the expense of little area overhead.","PeriodicalId":287602,"journal":{"name":"2012 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2765491.2765514","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

In this paper we present a zero-performance-overhead online fault detection and diagnosis scheme that exploits the vertical proximity of hardware inherent in 3D stacked integrated circuits (3D-SIC). We consider a 3D stacked processor executing independent instruction streams from different threads, on each die. We propose the vertical clustering of functionally identical computational blocks in order to enable the utilization of the 3D specific low-latency interlayer communication infrastructure. The clustering facilitates the parallel re-execution of instructions on idle units located in the proximity of the units which initially computed them and in this way creates the means for fault diagnosis and detection. We detail the control, interconnection communication infrastructure, instruction distribution, and results processing policies required for our scheme. To determine the effectiveness of the approach, we evaluate its performance in terms of diagnosis latency and percentage of verified operations on 3 to 8 core processors implemented on 3 to 8 tier 3D-SICs, respectively, by means of simulations. Our experiments indicate that the diagnosis latency ranges from 9 to 5 cycles, for 3 to 8 cores, respectively. For transient fault detection our simulations indicate that 86% to 94% of all executed instructions are verified, for 3 to 8 cores, respectively. When only one of the layers is protected against transient faults the number of verified operations increases to 94% to 99%, for the same simulation conditions. This suggests that, if certain conditions are fulfilled at design time, our approach can completely protect one instruction stream identified as being critical for the application. Our simulations clearly indicate that the proposed scheme has the potential to improve the 3D stacked integrated circuits dependability with no performance overhead and at the expense of little area overhead.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
三维堆叠集成电路零性能开销在线故障检测与诊断
在本文中,我们提出了一种零性能开销的在线故障检测和诊断方案,该方案利用了3D堆叠集成电路(3D- sic)中固有的硬件垂直接近性。我们考虑一个3D堆叠处理器,在每个die上执行来自不同线程的独立指令流。我们提出了功能相同的计算块的垂直聚类,以便能够利用3D特定的低延迟层间通信基础设施。聚类有助于在位于最初计算它们的单元附近的空闲单元上并行重新执行指令,并以这种方式创建故障诊断和检测的手段。我们详细介绍了我们的方案所需的控制、互连通信基础设施、指令分发和结果处理策略。为了确定该方法的有效性,我们通过模拟分别在3到8层3d - sic上实现的3到8核处理器上的诊断延迟和验证操作百分比来评估其性能。我们的实验表明,诊断延迟范围为9到5个周期,分别为3到8个核。对于瞬态故障检测,我们的模拟表明,对于3到8个核,分别有86%到94%的执行指令得到验证。在相同的模拟条件下,当只有一层防止瞬态故障时,验证操作的数量增加到94%到99%。这表明,如果在设计时满足某些条件,我们的方法可以完全保护标识为对应用程序至关重要的指令流。仿真结果清楚地表明,所提出的方案具有提高3D堆叠集成电路可靠性的潜力,且没有性能开销,面积开销很小。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Gate-level modeling for CMOS circuit simulation with ultimate FinFETs A novel write-scheme for data integrity in memristor-based crossbar memories Ternary volatile random access memory based on heterogeneous graphene-CMOS fabric Zero-performance-overhead online fault detection and diagnosis in 3D stacked integrated circuits A Monte Carlo analysis of a write method used in passive nanoelectronic crossbars
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1