玩火:重新访问事务性内存以实现容错和节能的MPSoC执行

Dimitra Papagiannopoulou, A. Marongiu, T. Moreshet, L. Benini, M. Herlihy, R. I. Bahar
{"title":"玩火:重新访问事务性内存以实现容错和节能的MPSoC执行","authors":"Dimitra Papagiannopoulou, A. Marongiu, T. Moreshet, L. Benini, M. Herlihy, R. I. Bahar","doi":"10.1145/2742060.2742090","DOIUrl":null,"url":null,"abstract":"As silicon integration technology pushes toward atomic dimensions, errors due to static and dynamic variability are an increasing concern. To avoid such errors, designers often turn to \"guardband\" restrictions on the operating frequency and voltage. If guardbands are too conservative, they limit performance and waste energy, but less conservative guardbands risk moving the system closer to its Critical Operating Point (COP), a frequency-voltage pair that, if surpassed, causes massive instruction failures. In this paper, we propose a novel scheme that allows to dynamically adjust to an evolving COP and operate at highly reduced margins, while guaranteeing forward progress. Specifically, our scheme dynamically monitors the platform and adaptively adjusts to the COP among multiple cores, using lightweight checkpointing and roll-back mechanisms adopted from Hardware Transactional Memory (HTM) for error recovery. Experiments demonstrate that our technique is particularly effective in saving energy while also offering safe execution guarantees. To the best of our knowledge, this work is the first to describe a full-fledged HTM implementation for error-resilient and energy-efficient MPSoC execution.","PeriodicalId":255133,"journal":{"name":"Proceedings of the 25th edition on Great Lakes Symposium on VLSI","volume":"168 9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Playing with Fire: Transactional Memory Revisited for Error-Resilient and Energy-Efficient MPSoC Execution\",\"authors\":\"Dimitra Papagiannopoulou, A. Marongiu, T. Moreshet, L. Benini, M. Herlihy, R. I. Bahar\",\"doi\":\"10.1145/2742060.2742090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As silicon integration technology pushes toward atomic dimensions, errors due to static and dynamic variability are an increasing concern. To avoid such errors, designers often turn to \\\"guardband\\\" restrictions on the operating frequency and voltage. If guardbands are too conservative, they limit performance and waste energy, but less conservative guardbands risk moving the system closer to its Critical Operating Point (COP), a frequency-voltage pair that, if surpassed, causes massive instruction failures. In this paper, we propose a novel scheme that allows to dynamically adjust to an evolving COP and operate at highly reduced margins, while guaranteeing forward progress. Specifically, our scheme dynamically monitors the platform and adaptively adjusts to the COP among multiple cores, using lightweight checkpointing and roll-back mechanisms adopted from Hardware Transactional Memory (HTM) for error recovery. Experiments demonstrate that our technique is particularly effective in saving energy while also offering safe execution guarantees. To the best of our knowledge, this work is the first to describe a full-fledged HTM implementation for error-resilient and energy-efficient MPSoC execution.\",\"PeriodicalId\":255133,\"journal\":{\"name\":\"Proceedings of the 25th edition on Great Lakes Symposium on VLSI\",\"volume\":\"168 9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 25th edition on Great Lakes Symposium on VLSI\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2742060.2742090\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th edition on Great Lakes Symposium on VLSI","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2742060.2742090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

摘要

随着硅集成技术向原子尺度发展,由于静态和动态变化引起的误差日益受到关注。为了避免这种错误,设计人员通常会对工作频率和电压进行“保护带”限制。如果保护带过于保守,它们会限制性能并浪费能量,但不太保守的保护带可能会使系统更接近其关键工作点(COP),这是一个频率电压对,如果超过该频率电压对,将导致大量指令故障。在本文中,我们提出了一种新的方案,该方案允许动态调整以适应不断变化的COP,并在保证前进的同时以高度降低的利润率运行。具体来说,我们的方案动态地监控平台,并自适应地调整多核之间的COP,使用硬件事务内存(Hardware Transactional Memory, HTM)中采用的轻量级检查点和回滚机制进行错误恢复。实验表明,该技术在节约能源的同时,也提供了安全的执行保证。据我们所知,这项工作是第一次描述了一个完整的HTM实现,用于容错和节能的MPSoC执行。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Playing with Fire: Transactional Memory Revisited for Error-Resilient and Energy-Efficient MPSoC Execution
As silicon integration technology pushes toward atomic dimensions, errors due to static and dynamic variability are an increasing concern. To avoid such errors, designers often turn to "guardband" restrictions on the operating frequency and voltage. If guardbands are too conservative, they limit performance and waste energy, but less conservative guardbands risk moving the system closer to its Critical Operating Point (COP), a frequency-voltage pair that, if surpassed, causes massive instruction failures. In this paper, we propose a novel scheme that allows to dynamically adjust to an evolving COP and operate at highly reduced margins, while guaranteeing forward progress. Specifically, our scheme dynamically monitors the platform and adaptively adjusts to the COP among multiple cores, using lightweight checkpointing and roll-back mechanisms adopted from Hardware Transactional Memory (HTM) for error recovery. Experiments demonstrate that our technique is particularly effective in saving energy while also offering safe execution guarantees. To the best of our knowledge, this work is the first to describe a full-fledged HTM implementation for error-resilient and energy-efficient MPSoC execution.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Small-World Network Enabled Energy Efficient and Robust 3D NoC Architectures A Novel Framework for Temperature Dependence Aware Clock Skew Scheduling Session details: VLSI Design Proceedings of the 25th edition on Great Lakes Symposium on VLSI Energy Efficient RRAM Spiking Neural Network for Real Time Classification
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1