{"title":"Dynamic dead-instruction detection and elimination","authors":"J. A. Butts, G. Sohi","doi":"10.1145/605397.605419","DOIUrl":null,"url":null,"abstract":"We observe a non-negligible fraction--3 to 16% in our benchmarks--of dynamically dead instructions, dynamic instruction instances that generate unused results. The majority of these instructions arise from static instructions that also produce useful results. We find that compiler optimization (specifically instruction scheduling) creates a significant portion of these partially dead static instructions. We show that most of the dynamically instructions arise from a small set of static instructions that produce dead values most of the time.We leverage this locality by proposing a dead instruction predictor and presenting a scheme to avoid the execution of predicted-dead instructions. Our predictor achieves an accuracy of 93% while identifying over 91% of the dead instructions using less than 5 KB of state. We achieve such high accuracies by leveraging future control flow information (i.e., branch predictions) to distinguish between useless and useful instances of the same static instruction.We then present a mechanism to avoid the register allocation, instruction scheduling, and execution of predicted dead instructions. We measure reductions in resource utilization averaging over 5% and sometimes exceeding 10%, covering physical register management (allocation and freeing), register file read and write traffic, and data cache accesses. Performance improves by an average of 3.6% on an architecture exhibiting resource contention. Additionally, our scheme frees future compilers from the need to consider the costs of dead instructions, enabling more aggressive code motion and optimization. Simultaneously, it mitigates the need for good path profiling information in making inter-block code motion decisions.","PeriodicalId":377379,"journal":{"name":"ASPLOS X","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"69","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ASPLOS X","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/605397.605419","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 69
Abstract
We observe a non-negligible fraction (3 to 16% in our benchmarks) of dynamically dead instructions, dynamic instruction instances that generate unused results. The majority of these instructions arise from static instructions that also produce useful results. We find that compiler optimization (specifically instruction scheduling) creates a significant portion of these partially dead static instructions. We show that most of the dynamically dead instructions arise from a small set of static instructions that produce dead values most of the time.

We leverage this locality by proposing a dead-instruction predictor and presenting a scheme to avoid the execution of predicted-dead instructions. Our predictor achieves an accuracy of 93% while identifying over 91% of the dead instructions using less than 5 KB of state. We achieve such high accuracies by leveraging future control flow information (i.e., branch predictions) to distinguish between useless and useful instances of the same static instruction.

We then present a mechanism to avoid the register allocation, instruction scheduling, and execution of predicted-dead instructions. We measure reductions in resource utilization averaging over 5% and sometimes exceeding 10%, covering physical register management (allocation and freeing), register file read and write traffic, and data cache accesses. Performance improves by an average of 3.6% on an architecture exhibiting resource contention. Additionally, our scheme frees future compilers from the need to consider the costs of dead instructions, enabling more aggressive code motion and optimization. Simultaneously, it mitigates the need for good path profiling information in making inter-block code motion decisions.
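The abstract describes the predictor only at a high level. The sketch below is a rough illustration, not the paper's design: a table of saturating confidence counters indexed by a hash of the producing instruction's PC and a short signature of predicted future control flow, so useless and useful instances of the same static instruction map to different entries. The table size, counter width, path-signature length, and all class/method names are assumptions made for illustration.

```python
# Hypothetical sketch of a dead-instruction predictor: confidence counters
# indexed by (PC, predicted future path). Parameters are illustrative only;
# 4096 x 2-bit entries happen to fit within the paper's "< 5 KB of state"
# budget, but are not the paper's actual configuration.

class DeadInstructionPredictor:
    def __init__(self, entries=4096, max_count=3, threshold=3):
        self.entries = entries      # number of table entries
        self.max_count = max_count  # saturating-counter ceiling (2-bit)
        self.threshold = threshold  # predict dead only at full confidence
        self.table = [0] * entries

    def _index(self, pc, future_branches):
        # Fold the static PC with a few bits of upcoming branch predictions
        # (taken/not-taken), so the prediction is path-sensitive.
        sig = 0
        for taken in future_branches[:3]:
            sig = (sig << 1) | int(taken)
        return (pc ^ (sig << 9)) % self.entries

    def predict_dead(self, pc, future_branches):
        # True => the result is predicted to go unused, so the pipeline may
        # skip register allocation, scheduling, and execution for it.
        return self.table[self._index(pc, future_branches)] >= self.threshold

    def update(self, pc, future_branches, was_dead):
        # Once the result's liveness is actually known (e.g., at retirement,
        # after the value is overwritten or its last consumer commits),
        # strengthen or reset confidence for this (PC, path) entry.
        i = self._index(pc, future_branches)
        if was_dead:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = 0
```

In such a scheme, predict_dead would be consulted early in the pipeline (e.g., at rename, when branch predictions for the upcoming path are available) and update would be invoked at commit, mirroring the prediction/verification split the abstract implies; how the real hardware structures these steps is specified in the paper itself.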