Using physical and simulated fault injection to evaluate error detection mechanisms

C. Constantinescu
Proceedings 1999 Pacific Rim International Symposium on Dependable Computing, published 1999-12-16. DOI: 10.1109/PRDC.1999.816228 (https://doi.org/10.1109/PRDC.1999.816228)
Citations: 9

Abstract

Effective error detection is paramount for building highly dependable computing systems. A new methodology, based on physical and simulated fault injection, is developed for evaluating error detection mechanisms. Our approach consists of two steps. First, transient faults are physically injected at the IC pin level of a prototype server. Experiments are carried out in a three-dimensional space of events, with the location, time of occurrence, and duration of each fault being randomly selected. Improved detection circuitry is devised for decreasing signal sensitivity to transients. Second, simulated fault injection is performed to assess the effectiveness of the new detection mechanisms without using expensive silicon implementations. Physical fault injection experiments, carried out on the server, and simulated fault injection, performed on a protocol checker, are presented. Detection effectiveness is measured by the error detection coverage, defined as the conditional probability that an error is detected given that an error occurs. Fault injection reveals that coverage probability is a function of fault duration. The protocol checker significantly improves error detection. However, further research is required to increase detection coverage of the errors induced by short transient faults.
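
The coverage definition in the abstract can be illustrated with a minimal sketch. It is not the author's tooling: the `Fault` record, the `sample_fault` helper, the nanosecond units, and the tuple layout of `outcomes` are all assumptions made for illustration. The sketch draws a fault at random from the three-dimensional event space (location, time of occurrence, duration) and estimates coverage per fault duration as the fraction of detected errors among the errors that actually occurred.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    """One point in the event space (all fields are hypothetical names)."""
    location: int        # index of the IC pin that receives the transient
    start_ns: float      # time of occurrence within the observation window
    duration_ns: float   # length of the transient


def sample_fault(rng, n_pins, window_ns, durations_ns):
    """Randomly select location, time of occurrence, and duration."""
    return Fault(
        location=rng.randrange(n_pins),
        start_ns=rng.uniform(0.0, window_ns),
        duration_ns=rng.choice(durations_ns),
    )


def coverage_by_duration(outcomes):
    """Estimate P(detected | error occurred) for each fault duration.

    `outcomes` is an iterable of (duration_ns, error_occurred, error_detected).
    Injections that caused no error are excluded, because coverage is a
    conditional probability given that an error occurs.
    """
    errors = defaultdict(int)
    detected = defaultdict(int)
    for duration_ns, occurred, was_detected in outcomes:
        if not occurred:
            continue
        errors[duration_ns] += 1
        if was_detected:
            detected[duration_ns] += 1
    return {d: detected[d] / errors[d] for d in errors}


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_fault(rng, n_pins=64, window_ns=1e6, durations_ns=[25, 100]))
    # Made-up outcomes, for illustration only (not data from the paper).
    demo = [(25, True, False), (25, True, True), (25, False, False), (100, True, True)]
    print(coverage_by_duration(demo))
```

Grouping by duration mirrors the abstract's observation that coverage is a function of fault duration; a real campaign would also report confidence intervals on each estimate.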