Online Prediction of Applications Cache Utility

Miquel Moretó, F. Cazorla, Alex Ramírez, M. Valero
{"title":"Online Prediction of Applications Cache Utility","authors":"Miquel Moretó, F. Cazorla, Alex Ramírez, M. Valero","doi":"10.1109/ICSAMOS.2007.4285748","DOIUrl":null,"url":null,"abstract":"General purpose architectures are designed to offer average high performance regardless of the particular application that is being run. Performance and power inefficiencies appear as a consequence for some programs. Reconfigurable hardware (cache hierarchy, branch predictor, execution units, bandwidth, etc.) has been proposed to overcome these inefficiencies by dynamically adapting the architecture to the application needs. However, nearly all the proposals use indirect measures or heuristics of performance to decide new configurations, what may lead to inefficiencies. In this paper we propose a runtime mechanism that allows to predict the throughput of an application on an architecture using a reconfigurable L2 cache. L2 cache size varies at a way granularity and we predict the performance of the same application on all other L2 cache sizes at the same time. We obtain for different L2 cache sizes an average error of 3.11%, a maximum error of 16.4% and standard deviation of 3.7%. No profiling or operating system participation is needed in this mechanism. We also give a hardware implementation that allows to reduce the hardware cost under 0.4% of the total L2 size and maintains high accuracy. This mechanism can be used to reduce power consumption in single threaded architectures and improve performance in multithreaded architectures that dynamically partition shared L2 caches.","PeriodicalId":106933,"journal":{"name":"2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2007-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAMOS.2007.4285748","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

General purpose architectures are designed to offer high average performance regardless of the particular application being run. As a consequence, performance and power inefficiencies appear for some programs. Reconfigurable hardware (cache hierarchy, branch predictor, execution units, bandwidth, etc.) has been proposed to overcome these inefficiencies by dynamically adapting the architecture to the application's needs. However, nearly all proposals use indirect measures or heuristics of performance to decide on new configurations, which may lead to inefficiencies. In this paper we propose a runtime mechanism that predicts the throughput of an application on an architecture with a reconfigurable L2 cache. The L2 cache size varies at way granularity, and we predict the performance of the application on all other L2 cache sizes at the same time. Across the different L2 cache sizes we obtain an average error of 3.11%, a maximum error of 16.4%, and a standard deviation of 3.7%. The mechanism requires no profiling or operating system participation. We also present a hardware implementation that keeps the hardware cost below 0.4% of the total L2 size while maintaining high accuracy. This mechanism can be used to reduce power consumption in single-threaded architectures and to improve performance in multithreaded architectures that dynamically partition shared L2 caches.
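A common way to estimate an application's miss behaviour for every possible number of L2 ways in a single pass is to keep LRU stack-distance (Mattson) counters: each access records the depth at which it hits in the LRU stack of its set, and the miss count for a cache with k ways is the number of accesses whose stack distance is at least k. The sketch below is a minimal illustration of that general technique only; it is not the paper's hardware design, and the names used (StackDistanceMonitor, record_access, misses_with_ways) are hypothetical.

```cpp
// Illustrative sketch (not the paper's exact mechanism): an LRU stack-distance
// monitor for one group of cache sets. Counting hits at each stack depth lets
// us derive the predicted miss count for every way count k at once.
#include <cstdint>
#include <iostream>
#include <list>
#include <vector>

class StackDistanceMonitor {
public:
    explicit StackDistanceMonitor(int max_ways)
        : hits_at_depth_(max_ways, 0), misses_beyond_(0), max_ways_(max_ways) {}

    // Record one access to a cache line tag: find its depth in the LRU stack
    // (0 = MRU), update the hit histogram, and move the tag to the MRU slot.
    void record_access(uint64_t tag) {
        int depth = 0;
        for (auto it = lru_stack_.begin(); it != lru_stack_.end(); ++it, ++depth) {
            if (*it == tag) {                  // hit at stack depth 'depth'
                hits_at_depth_[depth]++;
                lru_stack_.erase(it);
                lru_stack_.push_front(tag);
                return;
            }
        }
        misses_beyond_++;                      // misses for any k <= max_ways
        lru_stack_.push_front(tag);
        if (static_cast<int>(lru_stack_.size()) > max_ways_) lru_stack_.pop_back();
    }

    // Predicted misses if the cache had exactly k ways: every access whose
    // stack distance is >= k would miss.
    uint64_t misses_with_ways(int k) const {
        uint64_t misses = misses_beyond_;
        for (int depth = k; depth < max_ways_; ++depth) misses += hits_at_depth_[depth];
        return misses;
    }

private:
    std::list<uint64_t> lru_stack_;            // tags ordered MRU -> LRU
    std::vector<uint64_t> hits_at_depth_;      // hit counts per stack depth
    uint64_t misses_beyond_;                   // accesses deeper than max_ways
    int max_ways_;
};

int main() {
    StackDistanceMonitor mon(8);               // assume an 8-way L2 as the maximum
    uint64_t trace[] = {1, 2, 3, 1, 4, 2, 5, 1, 6, 3};
    for (uint64_t tag : trace) mon.record_access(tag);
    for (int k = 1; k <= 8; ++k)
        std::cout << "ways=" << k << " predicted misses=" << mon.misses_with_ways(k) << "\n";
    return 0;
}
```

In hardware, a monitor of this kind would typically sample only a small subset of cache sets and use narrow saturating counters, which is how the storage overhead can be kept to a small fraction of the total L2 size.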