Rigorous floating-point mixed-precision tuning

Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
Pub Date: 2017-01-01
DOI: 10.1145/3009837.3009846

Wei-Fan Chiang, Mark Baranowski, Ian Briggs, A. Solovyev, G. Gopalakrishnan, Zvonimir Rakamaric

Citations: 110

Abstract

Virtually all real-valued computations are carried out using floating-point data types and operations. The precision of these data types must be set with the goals of reducing the overall round-off error, but also emphasizing performance improvements. Often, a mixed-precision allocation achieves this optimum; unfortunately, there are no techniques available to compute such allocations and conservatively meet a given error target across all program inputs. In this work, we present a rigorous approach to precision allocation based on formal analysis via Symbolic Taylor Expansions, and error analysis based on interval functions. This approach is implemented in an automated tool called FPTuner that generates and solves a quadratically constrained quadratic program to obtain a precision-annotated version of the given expression. FPTuner automatically introduces all the requisite precision up and down casting operations. It also allows users to flexibly control precision allocation using constraints to cap the number of high precision operators as well as group operators to allocate the same precision to facilitate vectorization. We evaluate FPTuner by tuning several benchmarks and measuring the proportion of lower precision operators allocated as we increase the error threshold. We also measure the reduction in energy consumption resulting from executing mixed-precision tuned code on a real hardware platform. We observe significant energy savings in response to mixed-precision tuning, but also observe situations where unexpected compiler behaviors thwart intended optimizations.
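The allocation problem described in the abstract can be illustrated with a toy example. The sketch below is not FPTuner's algorithm: it replaces the quadratically constrained quadratic program with brute-force enumeration, keeps only the first-order terms of the error expansion, and ignores casting and input-rounding errors. The expression f = a*b + c, its input ranges, the hand-derived sensitivity bounds, and the names `SENSITIVITY`, `error_bound`, and `tune` are all assumptions made for illustration.

```python
from itertools import product

# Unit roundoff for IEEE 754 binary32 and binary64 (round-to-nearest).
EPS = {"single": 2.0 ** -24, "double": 2.0 ** -53}

# Hypothetical worst-case magnitudes of the first-order error coefficients
# for f = a*b + c over a, b in [0, 1] and c in [0, 100], derived by hand:
#   mul: |a*b| <= 1        add: |a*b + c| <= 101
SENSITIVITY = {"mul": 1.0, "add": 101.0}


def error_bound(allocation):
    """First-order round-off bound: sum of sensitivity * unit roundoff."""
    return sum(SENSITIVITY[op] * EPS[prec] for op, prec in allocation.items())


def tune(threshold):
    """Return the allocation with the fewest double-precision operators
    whose error bound stays within threshold, or None if none qualifies."""
    ops = sorted(SENSITIVITY)
    best = None
    for precs in product(("single", "double"), repeat=len(ops)):
        alloc = dict(zip(ops, precs))
        if error_bound(alloc) > threshold:
            continue  # violates the error target
        cost = precs.count("double")
        if best is None or cost < best[0]:
            best = (cost, alloc)
    return None if best is None else best[1]


if __name__ == "__main__":
    for threshold in (1e-5, 1e-7, 1e-20):
        print(threshold, tune(threshold))
```

In this toy model, loosening the threshold lets more operators drop to single precision, which mirrors the trend the authors measure. Enumeration is exponential in the number of operators, so it only serves to make the objective and constraint concrete.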
