嵌入式系统的新颖整除算法：最适合大除数的通用算法

IF 4.6 2区数学 Q1 MATHEMATICS, APPLIED Applied and Computational Mathematics Pub Date : 2024-07-24 DOI:10.11648/j.acm.20241304.12

Mervat M. A. Mahmoud, N. E. Elashker

{"title":"嵌入式系统的新颖整除算法：最适合大除数的通用算法","authors":"Mervat M. A. Mahmoud, N. E. Elashker","doi":"10.11648/j.acm.20241304.12","DOIUrl":null,"url":null,"abstract":"The integer Constant Division (ICD) is the type of integer division in which the divisor is known in advance, enabling pre-computing operations to be included. Therefore, it can be more efficient regarding computing resources and time. However, most ICD techniques are restricted by a few values or narrow boundaries for the divisor. On the other hand, the main approaches of the division algorithms, where the divisor is variable, are digit-by-digit and convergence methods. The first techniques are simple and have less sophisticated conversion logic for the quotient but also have the problem of taking significantly long latency. On the contrary, the convergence techniques rely on multiplication rather than subtraction. They estimate the quotient of division providing the quotient with minimal latency at the expense of precision. This article suggests a precise, generic, and novel integer division algorithm based on sequential recursion with fewer iterations. The suggested methodology relies on extracting the division results for non-powers-of-two divisors from those for the closest power-of-two divisors, which are obtained simply using the right bit shifting. To the authors’ best knowledge of the state-of-the-art, the number of iterations in the recurrent variable division is half the divisor bit size, and the Sweeney, Robertson, and Tocher (SRT) division, which is named after its developers, involves log2(n) iterations. The suggested algorithm has an [(m/(n-1))-1] number of recursive iterations, where m and n are the number of bits of the dividend and the divisor, respectively. The design is simulated in the Vivado tool for validation and implemented with a Zynq UltraScale FPGA. The technique performance depends on the number of nested divisions and the size of a LUT. The two factors change according to the value of the divisor. Nevertheless, the size of the LUT is proportional to the range and the number of bits of the divisor. Furthermore, the equation that controls the number of nested blocks is illustrated in the manuscript. The proposed technique applies to both constant and variable divisors with a compact hardware area in the case of constant division. The hardware implementation of constant division has unlimited values for dividends and divisors with a compact hardware area in the case of large divisors. However, using the design in the hardware implementation of variable division is up to 64-bit dividend and 12-bit divisor. The result analysis demonstrates that this algorithm is more efficient for constant division for large numbers.","PeriodicalId":55503,"journal":{"name":"Applied and Computational Mathematics","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Novel Integer Division for Embedded Systems: Generic Algorithm Optimal for Large Divisors\",\"authors\":\"Mervat M. A. Mahmoud, N. E. Elashker\",\"doi\":\"10.11648/j.acm.20241304.12\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The integer Constant Division (ICD) is the type of integer division in which the divisor is known in advance, enabling pre-computing operations to be included. Therefore, it can be more efficient regarding computing resources and time. However, most ICD techniques are restricted by a few values or narrow boundaries for the divisor. On the other hand, the main approaches of the division algorithms, where the divisor is variable, are digit-by-digit and convergence methods. The first techniques are simple and have less sophisticated conversion logic for the quotient but also have the problem of taking significantly long latency. On the contrary, the convergence techniques rely on multiplication rather than subtraction. They estimate the quotient of division providing the quotient with minimal latency at the expense of precision. This article suggests a precise, generic, and novel integer division algorithm based on sequential recursion with fewer iterations. The suggested methodology relies on extracting the division results for non-powers-of-two divisors from those for the closest power-of-two divisors, which are obtained simply using the right bit shifting. To the authors’ best knowledge of the state-of-the-art, the number of iterations in the recurrent variable division is half the divisor bit size, and the Sweeney, Robertson, and Tocher (SRT) division, which is named after its developers, involves log2(n) iterations. The suggested algorithm has an [(m/(n-1))-1] number of recursive iterations, where m and n are the number of bits of the dividend and the divisor, respectively. The design is simulated in the Vivado tool for validation and implemented with a Zynq UltraScale FPGA. The technique performance depends on the number of nested divisions and the size of a LUT. The two factors change according to the value of the divisor. Nevertheless, the size of the LUT is proportional to the range and the number of bits of the divisor. Furthermore, the equation that controls the number of nested blocks is illustrated in the manuscript. The proposed technique applies to both constant and variable divisors with a compact hardware area in the case of constant division. The hardware implementation of constant division has unlimited values for dividends and divisors with a compact hardware area in the case of large divisors. However, using the design in the hardware implementation of variable division is up to 64-bit dividend and 12-bit divisor. The result analysis demonstrates that this algorithm is more efficient for constant division for large numbers.\",\"PeriodicalId\":55503,\"journal\":{\"name\":\"Applied and Computational Mathematics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-07-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied and Computational Mathematics\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.11648/j.acm.20241304.12\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied and Computational Mathematics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.11648/j.acm.20241304.12","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}

引用次数: 0

摘要

整数常式除法（ICD）是一种整数除法，在这种除法中，除数是预先知道的，因此可以包含预计算操作。因此，它能更有效地利用计算资源和时间。然而，大多数 ICD 技术都受限于除数的几个值或狭窄边界。另一方面，除数可变的除法算法主要有逐位法和收敛法。第一种方法简单，商的转换逻辑不复杂，但也存在延迟时间长的问题。相反，收敛技术依靠的是乘法而不是减法。它们估算除法的商，以最小的延迟提供商，但牺牲了精度。本文提出了一种精确、通用和新颖的整数除法算法，该算法基于迭代次数较少的顺序递归。所建议的方法依赖于从最接近的二乘二除法的除法结果中提取非二乘二除法的除法结果，而这些结果只需使用正确的位移即可获得。根据作者对最新技术的了解，循环变量除法的迭代次数是除数位大小的一半，而以其开发者名字命名的 Sweeney、Robertson 和 Tocher（SRT）除法需要 log2(n) 次迭代。建议算法的递归迭代次数为[(m/(n-1))-1]次，其中 m 和 n 分别是除数和被除数的位数。设计在 Vivado 工具中模拟验证，并通过 Zynq UltraScale FPGA 实现。该技术的性能取决于嵌套除法的数量和 LUT 的大小。这两个因素会根据除数的值发生变化。不过，LUT 的大小与除数的范围和位数成正比。此外，手稿中还说明了控制嵌套块数量的方程。所提出的技术既适用于常除法，也适用于变除法，在常除法的情况下，硬件面积更小。常数除法的硬件实现对红利和除数的取值没有限制，在大除数的情况下硬件面积小。然而，在变除法的硬件实现中，使用该设计最多可实现 64 位红利和 12 位除数。结果分析表明，这种算法对大数恒除法更有效。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Novel Integer Division for Embedded Systems: Generic Algorithm Optimal for Large Divisors

The integer Constant Division (ICD) is the type of integer division in which the divisor is known in advance, enabling pre-computing operations to be included. Therefore, it can be more efficient regarding computing resources and time. However, most ICD techniques are restricted by a few values or narrow boundaries for the divisor. On the other hand, the main approaches of the division algorithms, where the divisor is variable, are digit-by-digit and convergence methods. The first techniques are simple and have less sophisticated conversion logic for the quotient but also have the problem of taking significantly long latency. On the contrary, the convergence techniques rely on multiplication rather than subtraction. They estimate the quotient of division providing the quotient with minimal latency at the expense of precision. This article suggests a precise, generic, and novel integer division algorithm based on sequential recursion with fewer iterations. The suggested methodology relies on extracting the division results for non-powers-of-two divisors from those for the closest power-of-two divisors, which are obtained simply using the right bit shifting. To the authors’ best knowledge of the state-of-the-art, the number of iterations in the recurrent variable division is half the divisor bit size, and the Sweeney, Robertson, and Tocher (SRT) division, which is named after its developers, involves log₂(n) iterations. The suggested algorithm has an [(m/(n-1))-1] number of recursive iterations, where m and n are the number of bits of the dividend and the divisor, respectively. The design is simulated in the Vivado tool for validation and implemented with a Zynq UltraScale FPGA. The technique performance depends on the number of nested divisions and the size of a LUT. The two factors change according to the value of the divisor. Nevertheless, the size of the LUT is proportional to the range and the number of bits of the divisor. Furthermore, the equation that controls the number of nested blocks is illustrated in the manuscript. The proposed technique applies to both constant and variable divisors with a compact hardware area in the case of constant division. The hardware implementation of constant division has unlimited values for dividends and divisors with a compact hardware area in the case of large divisors. However, using the design in the hardware implementation of variable division is up to 64-bit dividend and 12-bit divisor. The result analysis demonstrates that this algorithm is more efficient for constant division for large numbers.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Applied and Computational Mathematics 数学-应用数学

CiteScore

8.80

自引率

5.00%

发文量

审稿时长

6 months

期刊介绍： Applied and Computational Mathematics (ISSN Online: 2328-5613, ISSN Print: 2328-5605) is a prestigious journal that focuses on the field of applied and computational mathematics. It is driven by the computational revolution and places a strong emphasis on innovative applied mathematics with potential for real-world applicability and practicality. The journal caters to a broad audience of applied mathematicians and scientists who are interested in the advancement of mathematical principles and practical aspects of computational mathematics. Researchers from various disciplines can benefit from the diverse range of topics covered in ACM. To ensure the publication of high-quality content, all research articles undergo a rigorous peer review process. This process includes an initial screening by the editors and anonymous evaluation by expert reviewers. This guarantees that only the most valuable and accurate research is published in ACM.