M. Boersma, M. Kroener, Christophe Layer, Petra Leber, S. M. Müller, Kerstin Schelm
{"title":"The POWER7 Binary Floating-Point Unit","authors":"M. Boersma, M. Kroener, Christophe Layer, Petra Leber, S. M. Müller, Kerstin Schelm","doi":"10.1109/ARITH.2011.21","DOIUrl":null,"url":null,"abstract":"The binary Floating-Point Unit (FPU) of the POWER7 processor is a 5.5 cycle Fused Multiply-Add (FMA) design, fully compliant with the IEEE 754-2008 standard. Unlike previous PowerPC designs, the POWER7 FPU merges the scalar and vector FPUs into a single unit executing three floating-point instruction sets: the single and double precision scalar set, the single precision VMX vector set, and the new single and double precision VSX vector and scalar set. Due to a compact buffer-free floor plan and several optimizations in the data and control flow, the streamlined POWER7 FPU achieves a factor of 2 area reduction over the POWER6 design, beyond the normal technology shrink. This results in a very power and area efficient FPU design, supporting a chip frequency of 4.14GHz. A single 64-bit FPU instance measures only 0.26mm2 in 45nm CMOS SOI.","PeriodicalId":272151,"journal":{"name":"2011 IEEE 20th Symposium on Computer Arithmetic","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 20th Symposium on Computer Arithmetic","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.2011.21","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
The binary Floating-Point Unit (FPU) of the POWER7 processor is a 5.5 cycle Fused Multiply-Add (FMA) design, fully compliant with the IEEE 754-2008 standard. Unlike previous PowerPC designs, the POWER7 FPU merges the scalar and vector FPUs into a single unit executing three floating-point instruction sets: the single and double precision scalar set, the single precision VMX vector set, and the new single and double precision VSX vector and scalar set. Due to a compact buffer-free floor plan and several optimizations in the data and control flow, the streamlined POWER7 FPU achieves a factor of 2 area reduction over the POWER6 design, beyond the normal technology shrink. This results in a very power and area efficient FPU design, supporting a chip frequency of 4.14GHz. A single 64-bit FPU instance measures only 0.26mm2 in 45nm CMOS SOI.