Hi-fi playback: Tolerating position errors in shift operations of racetrack memory

Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, J. Shu
Published in: 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA), pp. 694-706
Publication date: 2015-06-13
DOI: 10.1145/2749469.2750388 (https://doi.org/10.1145/2749469.2750388)
Citations: 69

Abstract

Racetrack memory is an emerging non-volatile memory based on spintronic domain wall technology. It offers ultra-high storage density, and its read/write speed is comparable to that of SRAM. Because its storage cell has a tape-like structure, a “shift” operation is needed to access racetrack memory. Prior research has therefore focused mainly on minimizing the shift latency and energy of racetrack memory while leveraging its ultra-high storage density. The reliability of the shift operation, however, has not been well addressed. In fact, racetrack memory suffers from unsuccessful shifts caused by domain misalignment. Such a problem is called a “position error” in this work. It can reduce the mean-time-to-failure (MTTF) of racetrack memory to an intolerable level. Even worse, conventional error correction codes (ECCs), which are designed for “bit errors”, cannot protect racetrack memory from position errors. In this work, we investigate the position error model of a shift operation and categorize position errors into two types: the “stop-in-middle” error and the “out-of-step” error. To eliminate the stop-in-middle error, we propose a technique called sub-threshold shift (STS), which performs a more reliable shift in two stages. To detect and recover from the out-of-step error, we propose a protection mechanism called position error correction code (p-ECC). We first describe how to design a p-ECC for different protection strengths and analyze the corresponding design overhead. We then show how to reduce the area cost of p-ECC by leveraging the “overhead region” in a racetrack memory stripe. With these protection mechanisms, we introduce a position-error-aware shift architecture. Experimental results demonstrate that, with our techniques, the overall MTTF of racetrack memory improves from 1.33μs to more than 69 years, with only 0.2% performance degradation. Trade-offs among reliability, area, performance, and energy are also explored and discussed.
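
The distinction between a bit error and an out-of-step position error can be illustrated with a small simulation. The sketch below is not the p-ECC design from the paper, whose details the abstract does not give. It is a toy model under stated assumptions: the stripe is treated as a ring of domains, a hypothetical check region stores the cyclic pattern 0011 (in which every 2-bit window is unique), and hypothetical check ports read that window after each shift. The names `Stripe`, `PATTERN`, `infer_step_error`, and `check_and_repair` are all invented for illustration.

```python
# Toy model only: every name and the check pattern are illustrative assumptions,
# not the p-ECC encoding described in the paper.

PATTERN = [0, 0, 1, 1]   # cyclic pattern in which every 2-bit window is unique
WINDOW = 2               # number of check bits read after each shift


def read_window(offset):
    """Check bits seen when the tape sits at `offset` (pattern treated as a ring)."""
    n = len(PATTERN)
    return tuple(PATTERN[(offset + i) % n] for i in range(WINDOW))


# Map each observable window back to the tape offset modulo the pattern length.
DECODE = {read_window(o): o for o in range(len(PATTERN))}


def infer_step_error(believed, window):
    """Signed step error implied by the check window.

    Returns 0 when aligned, +1 or -1 for a single over- or under-shift.
    An error of two steps is reported as +2 because its direction is ambiguous.
    """
    n = len(PATTERN)
    diff = (DECODE[window] - believed) % n
    return diff - n if diff > n // 2 else diff


class Stripe:
    """A racetrack stripe modelled as a ring of data domains under one port."""

    def __init__(self, data):
        self.data = list(data)
        self.actual = 0     # true offset of the tape under the access port
        self.believed = 0   # offset the controller thinks the tape has

    def shift(self, steps, fault=0):
        """Shift by `steps`; a non-zero `fault` models an out-of-step error."""
        self.believed += steps
        self.actual += steps + fault

    def read(self):
        """Return the bit under the port; it is a valid stored bit even if misaligned."""
        return self.data[self.actual % len(self.data)]

    def check_and_repair(self):
        """Read the check window, infer the step error, and shift back by it."""
        err = infer_step_error(self.believed, read_window(self.actual))
        self.actual -= err
        return err


stripe = Stripe([1, 0, 1, 1, 0, 0, 1, 0])
stripe.shift(3, fault=1)            # intended domain 3, tape lands on domain 4
wrong_bit = stripe.read()           # a well-formed bit from the wrong domain
step_error = stripe.check_and_repair()
right_bit = stripe.read()           # after the corrective shift
```

In this toy model the misaligned read returns a perfectly well-formed stored bit, so nothing in the data itself signals the problem; the fault only shows up as a disagreement between the position the controller believes and the position implied by the check window. A longer pattern or a wider window would be needed to disambiguate larger step errors, which loosely mirrors the trade-off between protection strength and overhead that the abstract mentions.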