Do we need anything more than single bit error correction (ECC)?

M. Spica, T. M. Mak
Records of the 2004 International Workshop on Memory Technology, Design and Testing, 2004.
Publication date: 2004-08-30
DOI: 10.1109/MTDT.2004.9 (https://doi.org/10.1109/MTDT.2004.9)
Citations: 57

Abstract

For a long time, single-bit error correction (with double-bit error detection) has been the mainstay ECC technology for covering soft errors in caches. Given the soft error rates observed so far (at least terrestrially), people have been content with what single-bit correction can offer. On the rare occasion that a double error occurs, ECC can still alert the system, which can then shut down gracefully or respond in some other way. However, things are changing. As technology scaling continues, we are approaching the point where a single piece of silicon will hold a billion transistors, with a large part of that budget spent on memory elements. The number of memory bits in a system is also on the rise. Scaled technology also brings many variations and sensitivities that can cause memory cells to malfunction, or to function poorly under certain environmental conditions. Increasingly, ECC no longer serves only to correct radiation-induced soft errors; it may be able to correct other kinds of faults as well. Will ECC be able to serve this multi-faceted role? Do we need more than single-bit error correction? Can we afford the cost of multiple-bit error correction? Should we need it? This paper attempts to answer some of these questions and to raise issues with the status quo.