异构内存体系结构的可靠性感知数据放置

Manish Gupta, Vilas Sridharan, D. Roberts, A. Prodromou, A. Venkat, D. Tullsen, Rajesh K. Gupta
{"title":"异构内存体系结构的可靠性感知数据放置","authors":"Manish Gupta, Vilas Sridharan, D. Roberts, A. Prodromou, A. Venkat, D. Tullsen, Rajesh K. Gupta","doi":"10.1109/HPCA.2018.00056","DOIUrl":null,"url":null,"abstract":"System reliability is a first-class concern as technology continues to shrink, resulting in increased vulnerability to traditional sources of errors such as single event upsets. By tracking access counts and the Architectural Vulnerability Factor (AVF), application data can be partitioned into groups based on how frequently it is accessed (its \"hotness\") and its likelihood to cause program execution error (its \"risk\"). This is particularly useful for memory systems which exhibit heterogeneity in their performance and reliability such as Heterogeneous Memory Architectures – with a typical configuration combining slow, highly reliable memory with faster, less reliable memory. This work demonstrates that current state of the art, performance-focused data placement techniques affect reliability adversely. It shows that page risk is not necessarily correlated with its hotness; this makes it possible to identify pages that are both hot and low risk, enabling page placement strategies that can find a good balance of performance and reliability. This work explores heuristics to identify and monitor both hotness and risk at run-time, and further proposes static, dynamic, and program annotation-based reliability-aware data placement techniques. This enables an architect to choose among available memories with diverse performance and reliability characteristics. The proposed heuristic-based reliability-aware data placement improves reliability by a factor of 1.6x compared to performance-focused static placement while limiting the performance degradation to 1%. A dynamic reliability-aware migration scheme, which does not require prior knowledge about the application, improves reliability by a factor of 1.5x on average while limiting the performance loss to 4.9%. Finally, program annotation-based data placement improves the reliability by 1.3x at a performance cost of 1.1%.","PeriodicalId":154694,"journal":{"name":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Reliability-Aware Data Placement for Heterogeneous Memory Architecture\",\"authors\":\"Manish Gupta, Vilas Sridharan, D. Roberts, A. Prodromou, A. Venkat, D. Tullsen, Rajesh K. Gupta\",\"doi\":\"10.1109/HPCA.2018.00056\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"System reliability is a first-class concern as technology continues to shrink, resulting in increased vulnerability to traditional sources of errors such as single event upsets. By tracking access counts and the Architectural Vulnerability Factor (AVF), application data can be partitioned into groups based on how frequently it is accessed (its \\\"hotness\\\") and its likelihood to cause program execution error (its \\\"risk\\\"). This is particularly useful for memory systems which exhibit heterogeneity in their performance and reliability such as Heterogeneous Memory Architectures – with a typical configuration combining slow, highly reliable memory with faster, less reliable memory. This work demonstrates that current state of the art, performance-focused data placement techniques affect reliability adversely. It shows that page risk is not necessarily correlated with its hotness; this makes it possible to identify pages that are both hot and low risk, enabling page placement strategies that can find a good balance of performance and reliability. This work explores heuristics to identify and monitor both hotness and risk at run-time, and further proposes static, dynamic, and program annotation-based reliability-aware data placement techniques. This enables an architect to choose among available memories with diverse performance and reliability characteristics. The proposed heuristic-based reliability-aware data placement improves reliability by a factor of 1.6x compared to performance-focused static placement while limiting the performance degradation to 1%. A dynamic reliability-aware migration scheme, which does not require prior knowledge about the application, improves reliability by a factor of 1.5x on average while limiting the performance loss to 4.9%. Finally, program annotation-based data placement improves the reliability by 1.3x at a performance cost of 1.1%.\",\"PeriodicalId\":154694,\"journal\":{\"name\":\"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2018.00056\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2018.00056","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25

摘要

随着技术的不断萎缩,系统可靠性是头等大事,这导致传统错误来源(如单事件干扰)的脆弱性增加。通过跟踪访问计数和体系结构漏洞因子(AVF),可以根据访问的频率(其“热度”)和导致程序执行错误的可能性(其“风险”)将应用程序数据划分为组。这对于表现出性能和可靠性异质性的内存系统特别有用,例如异构内存体系结构——典型配置结合了慢速、高可靠的内存和快速、不太可靠的内存。这项工作表明,目前以性能为中心的数据放置技术对可靠性产生了不利影响。结果表明,页面风险与其热度之间没有必然的相关性;这使得识别高风险和低风险的页面成为可能,从而启用能够在性能和可靠性之间找到良好平衡的页面放置策略。这项工作探索了在运行时识别和监控热度和风险的启发式方法,并进一步提出了静态、动态和基于程序注释的可靠性感知数据放置技术。这使架构师能够在具有不同性能和可靠性特性的可用存储器中进行选择。与以性能为中心的静态数据放置相比,提出的基于启发式的可靠性感知数据放置将可靠性提高了1.6倍,同时将性能下降限制在1%以内。动态可靠性感知迁移方案不需要预先了解应用程序,可将可靠性平均提高1.5倍,同时将性能损失限制在4.9%。最后,基于程序注释的数据放置以1.1%的性能代价将可靠性提高了1.3倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Reliability-Aware Data Placement for Heterogeneous Memory Architecture
System reliability is a first-class concern as technology continues to shrink, resulting in increased vulnerability to traditional sources of errors such as single event upsets. By tracking access counts and the Architectural Vulnerability Factor (AVF), application data can be partitioned into groups based on how frequently it is accessed (its "hotness") and its likelihood to cause program execution error (its "risk"). This is particularly useful for memory systems which exhibit heterogeneity in their performance and reliability such as Heterogeneous Memory Architectures – with a typical configuration combining slow, highly reliable memory with faster, less reliable memory. This work demonstrates that current state of the art, performance-focused data placement techniques affect reliability adversely. It shows that page risk is not necessarily correlated with its hotness; this makes it possible to identify pages that are both hot and low risk, enabling page placement strategies that can find a good balance of performance and reliability. This work explores heuristics to identify and monitor both hotness and risk at run-time, and further proposes static, dynamic, and program annotation-based reliability-aware data placement techniques. This enables an architect to choose among available memories with diverse performance and reliability characteristics. The proposed heuristic-based reliability-aware data placement improves reliability by a factor of 1.6x compared to performance-focused static placement while limiting the performance degradation to 1%. A dynamic reliability-aware migration scheme, which does not require prior knowledge about the application, improves reliability by a factor of 1.5x on average while limiting the performance loss to 4.9%. Finally, program annotation-based data placement improves the reliability by 1.3x at a performance cost of 1.1%.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Record-Replay Architecture as a General Security Framework LATTE-CC: Latency Tolerance Aware Adaptive Cache Compression Management for Energy Efficient GPUs Secure DIMM: Moving ORAM Primitives Closer to Memory OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1