Authors: P. Johnstone
Journal: ACM-SE 20
Publication date: 1982-04-01
DOI: https://doi.org/10.1145/503896.503911
Citations: 3
Representational error in binary and decimal numbering systems
The representation of a general rational number of the form A/B as a floating point number requires a conversion from the general form to a base-specific form. This conversion often generates infinitely repeating non-zero strings of digits, which are truncated to the size of the mantissa, resulting in a loss of precision. It is shown that the proportion of repeating versus finite rational numbers specific to a base is exponentially related to the number of unique prime factors of the base. Simulation results are presented which show the relative proportions of finite representations for binary and decimal cases over a range of mantissa sizes.

The representation of rational numbers in computer systems is typically implemented by modified forms of scientific notation that are referred to as floating point representations. That is, all rational numbers in general fractional form are converted to a rational number of a default base and stored as a mantissa of fixed precision scaled by a power of the base. In this form the denominator need not be explicitly represented or manipulated. This simplification limits the computational overhead and extends the range of the representation at the price of precision. Generally the normalization and base of such numbers are assumed to be a default understood by the algorithms which manipulate them.

These systems of floating point representation can be grouped into three distinct categories: (1) a binary mantissa and exponent; (2) a binary encoding of decimal digits; (3) a binary encoding of a decimal mantissa. Of these representations the first is a true base two scientific notation and has been the choice of the overwhelming majority of floating point systems. The remaining methods rely on binary representation only as a medium for storage and representation of values. That is, their algorithms perform actual decimal manipulations. Values represented by these methods are therefore binary encodings of rational numbers in decimal form.

All of these representations require that rational numbers in fractional form be converted to a rational number in the appropriate base. To express a general rational number as a base-specific rational number requires conversion of the original fraction to a form in which the denominator is some power of the base of representation. For example, we typically convert the fraction 1/2 to 5/10 or its equivalent form .5 to express it within a floating point representation. This power is saved as the "exponent" in floating point forms.

The key problem in these systems of representation is that not all possible fractions (in fact only a very limited subset) can be converted to a base-specific representation with a finite numerator. For example, the rational number 1/3 (1/11 in binary) cannot be represented in either decimal or binary systems with a finite numerator over a power of the base. Floating point systems of representation must therefore truncate the least significant digits of the numerator to fit within the finite size of the mantissa. The result is a loss of significance and an imprecision which is introduced into subsequent computations. It follows that this same phenomenon (i.e. truncation of

This work was funded in part by the Academic Grant Fund of Loyola University, New Orleans.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 1982 ACM 0-89791-071-0/82/0400-0085 $00.75
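
The condition described in the text, that a fraction A/B has a finite representation only if it can be rewritten with a power of the base as its denominator, can be illustrated with a short sketch. The Python code below is not taken from the paper; the function names `has_finite_expansion` and `fractional_digits` are hypothetical helpers chosen for illustration. The first checks whether every prime factor of the reduced denominator also divides the base, and the second produces truncated digits by long division, roughly as a fixed-size mantissa would hold them.

```python
from fractions import Fraction
from math import gcd


def has_finite_expansion(a: int, b: int, base: int) -> bool:
    """Return True if a/b equals N / base**k for some integers N and k >= 0.

    Illustrative helper (not from the paper): a/b terminates in `base`
    exactly when the reduced denominator has no prime factor outside the base.
    """
    den = Fraction(a, b).denominator  # denominator in lowest terms, always > 0
    g = gcd(den, base)
    while g > 1:
        # Remove every factor the denominator shares with the base.
        while den % g == 0:
            den //= g
        g = gcd(den, base)
    return den == 1


def fractional_digits(a: int, b: int, base: int, n: int) -> list[int]:
    """First n digits after the radix point of a/b in `base`, truncated."""
    digits = []
    rem = a % b
    for _ in range(n):
        rem *= base
        digits.append(rem // b)  # next digit of the expansion
        rem %= b                 # remainder carried to the next position
    return digits


if __name__ == "__main__":
    for a, b in [(1, 2), (1, 3), (1, 10)]:
        print(f"{a}/{b}: finite in binary = {has_finite_expansion(a, b, 2)}, "
              f"finite in decimal = {has_finite_expansion(a, b, 10)}")
    print(fractional_digits(1, 3, 2, 8))   # [0, 1, 0, 1, 0, 1, 0, 1]
    print(fractional_digits(1, 3, 10, 8))  # [3, 3, 3, 3, 3, 3, 3, 3]
```

In this sketch 1/2 is finite in both bases and 1/3 repeats in both, matching the examples in the text. 1/10 is finite in decimal but not in binary, because the denominator's factor 5 divides 10 but not 2.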