Title: Faithfully Rounded Floating-point Computations
Authors: M. Lange, S. Rump
Journal: ACM Transactions on Mathematical Software (TOMS), volume 37 1, pages 1 - 20
Publication date: 2020-07-13
Publication type: Journal Article
DOI: 10.1145/3290955 (https://doi.org/10.1145/3290955)
Citation count: 8 (Semanticscholar)

We present a pair arithmetic for the four basic operations and square root. It can be regarded as a simplified, more efficient double-double arithmetic. The central assumption on the underlying arithmetic is the first standard model of error analysis for operations on a discrete set of real numbers; we require neither a floating-point grid nor a round-to-nearest property. Based on that, we define a relative rounding error unit u and prove rigorous error bounds for the computed result of an arbitrary arithmetic expression, depending on u, the size of the expression, and possibly a condition measure. In the second part of this note, we extend the error analysis by examining the requirements that ensure faithfully rounded outputs, and we apply our results to floating-point systems conforming to the IEEE 754 standard. For a class of mathematical expressions evaluated in IEEE 754-conforming arithmetic with base β, the result is proved to be faithfully rounded for up to 1/√(βu) - 2 operations. Our findings cover a number of previously published algorithms for computing faithfully rounded results, among them Horner's scheme, products, sums, dot products, and the Euclidean norm. Beyond that, several other problems can be analyzed, such as polynomial interpolation, orientation problems, Householder transformations, and the smallest singular value of Hilbert matrices of large size.
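
As a rough illustration of what a pair arithmetic of this kind can look like, the following Python sketch represents each value as an unevaluated pair (high part, low part) and omits the renormalization step that a full double-double arithmetic would perform. It is not taken from the paper: the function names (`two_sum`, `split`, `two_prod`, `pair_add`, `pair_mul`, `pair_sum`, `max_ops`) are hypothetical, the error-free transformations are the classical ones due to Knuth and Dekker, and the authors' actual algorithms may differ in detail. The sketch assumes IEEE 754 binary64 with rounding to nearest (β = 2, u = 2^-53, which is what Python floats provide on common platforms) and no overflow or underflow.

```python
from fractions import Fraction
import math


def two_sum(a, b):
    """Knuth's TwoSum: return (x, e) with x = fl(a + b) and x + e == a + b."""
    x = a + b
    z = x - a
    e = (a - (x - z)) + (b - z)
    return x, e


def split(a):
    """Veltkamp split of a binary64 number into two non-overlapping parts."""
    c = 134217729.0 * a  # factor 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Dekker's TwoProduct: return (x, e) with x = fl(a * b) and x + e == a * b."""
    x = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((x - ah * bh) - al * bh) - ah * bl)
    return x, e


def pair_add(p, q):
    """Add two pairs; the low parts are accumulated in ordinary arithmetic."""
    x, e = two_sum(p[0], q[0])
    return x, e + (p[1] + q[1])


def pair_mul(p, q):
    """Multiply two pairs, dropping the low-times-low term."""
    x, e = two_prod(p[0], q[0])
    return x, e + (p[0] * q[1] + p[1] * q[0])


def pair_sum(values):
    """Sum floats in pair arithmetic and round the final pair to one float."""
    acc = (0.0, 0.0)
    for v in values:
        acc = pair_add(acc, (v, 0.0))
    return acc[0] + acc[1]


def max_ops(beta=2, u=2.0 ** -53):
    """Evaluate the operation count 1/sqrt(beta*u) - 2 quoted in the abstract."""
    return 1.0 / math.sqrt(beta * u) - 2.0


if __name__ == "__main__":
    data = [0.1] * 10
    exact = sum(Fraction(v) for v in data)  # exact rational sum of the inputs
    print("plain float sum:", sum(data))
    print("pair sum:       ", pair_sum(data))
    print("exact sum:      ", float(exact))
    print("operation bound for binary64:", max_ops())
```

Under these assumptions the quoted bound evaluates to 2^26 - 2 = 67108862 operations for binary64. Skipping renormalization is the design choice that makes such a pair arithmetic cheaper than double-double; the price is that the error bounds have to account for a low part that may grow relative to the high part.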