Catching accurate profiles in hardware

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date: 2003-02-08. DOI: 10.1109/HPCA.2003.1183545

S. Narayanasamy, T. Sherwood, S. Sair, B. Calder, G. Varghese


Citations: 26

Abstract



Run-time optimization is one of the most important ways of getting performance out of modern processors. Techniques such as prefetching, trace caching, and memory disambiguation are all based upon the principle of observation followed by adaptation, and all make use of some sort of profile information gathered at run-time. Programs are very complex, and the real trick in generating useful run-time profiles is sifting through all the unimportant and infrequently occurring events to find those that are important enough to warrant optimization. In this paper, we present the multi-hash architecture to catch important events even in the presence of extensive noise. Multi-hash uses a small amount of area, between 7 and 16 kilobytes, to accurately capture these important events in hardware, without requiring any software support. This is achieved using multiple hash tables for the filtering, and interval-based profiling to help identify how important an event is in relation to all the other events. We evaluate our design for value and edge profiling, and show that over a set of benchmarks, we get an average error of less than 1%.
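The abstract names two mechanisms, filtering through multiple hash tables and interval-based profiling, but does not give their details. The Python sketch below is one plausible software model of that idea, not the paper's hardware design. The class name `MultiHashProfiler`, the table count, table size, interval length, promotion threshold, and the use of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


class MultiHashProfiler:
    """Illustrative model only; every parameter here is an assumption."""

    def __init__(self, num_tables=4, table_size=512, interval=10_000, threshold=100):
        self.num_tables = num_tables
        self.table_size = table_size
        self.interval = interval      # events per profiling interval
        self.threshold = threshold    # count needed in EVERY table to be promoted
        self.tables = [[0] * table_size for _ in range(num_tables)]
        self.accumulator = Counter()  # events that passed the filter
        self.seen = 0
        self.profiles = []            # one {event: count} dict per finished interval

    def _index(self, table_id, event):
        # A different hash per table, so two events rarely collide in all tables.
        digest = hashlib.sha256(f"{table_id}:{event!r}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.table_size

    def observe(self, event):
        if event in self.accumulator:
            # Already identified as important: count it exactly.
            self.accumulator[event] += 1
        else:
            counts = []
            for t in range(self.num_tables):
                i = self._index(t, event)
                self.tables[t][i] += 1
                counts.append(self.tables[t][i])
            # The smallest counter is the tightest upper bound on the true count.
            if min(counts) >= self.threshold:
                self.accumulator[event] = min(counts)
        self.seen += 1
        if self.seen == self.interval:
            self._end_interval()

    def _end_interval(self):
        # Record this interval's profile, then start the next one from scratch.
        self.profiles.append(dict(self.accumulator))
        self.tables = [[0] * self.table_size for _ in range(self.num_tables)]
        self.accumulator.clear()
        self.seen = 0
```

In this model an infrequent event is only promoted if it shares a heavily loaded counter in every table at once, which is what lets a few small tables reject noise. Resetting at each interval boundary means the threshold acts as a fraction of the interval length, so an event's count can be read as its importance relative to all other events in that interval.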
