Numerous applications based on very large scale intergration (VLSI) architecture suffer from large size components that lead to an error in the design of the filter during the stages of floating point arithmetic. Hence, it is necessary to change the architectural model that increases the design complexity and the time delay effect. The issue encountered in the VLSI architectures for finite impulse response (FIR) filter is the increased number of components, especially delay elements. For the VLSI architecture reconfigured with reduced register usage, this article provides the floating point processing element (FPPE) implementation with Cross-Folded Shifting. The proposed FIR filter system reduces the number of components in the circuit which increases the complexity and high delay rate in the logical operation. The system has a comparatively reduced delay rate and power consumption. Hence, an efficient fast architecture based on the FPPE method is developed in this paper.
{"title":"FPGA-based implementation of floating point processing element for the design of efficient FIR filters","authors":"Tintu Mary John, Shanty Chacko","doi":"10.1049/cdt2.12010","DOIUrl":"10.1049/cdt2.12010","url":null,"abstract":"<p>Numerous applications based on very large scale intergration (VLSI) architecture suffer from large size components that lead to an error in the design of the filter during the stages of floating point arithmetic. Hence, it is necessary to change the architectural model that increases the design complexity and the time delay effect. The issue encountered in the VLSI architectures for finite impulse response (FIR) filter is the increased number of components, especially delay elements. For the VLSI architecture reconfigured with reduced register usage, this article provides the floating point processing element (FPPE) implementation with Cross-Folded Shifting. The proposed FIR filter system reduces the number of components in the circuit which increases the complexity and high delay rate in the logical operation. The system has a comparatively reduced delay rate and power consumption. Hence, an efficient fast architecture based on the FPPE method is developed in this paper.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 4","pages":"296-301"},"PeriodicalIF":1.2,"publicationDate":"2021-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83302759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad Saeed Ansari, Shyama Gandhi, Bruce F. Cockburn, Jie Han
The logarithmic number system (LNS) can be used to simplify the computation of arithmetic functions, such as multiplication. This article proposes three leading-one detectors (LODs) to speed up the binary logarithm calculation in the LNS. The first LOD (LOD I) uses a single fixed value to approximate the d least significant bits (LSBs) in the outputs of the LOD. The second design (LOD II) partitions the d LSBs into smaller fields and uses a multiplexer to select the closest approximation to the exact value. These two LODs help with error cancellation as they introduce signed errors for inputs N < 2d. Additionally, a scaling scheme is proposed that scales up the input N < 2d to avoid large approximation errors. Finally, an improved exact LOD (LOD III) is proposed that only passes half of the input N to the LOD; the more significant half is passed if there is at least one ‘1’ in that half; otherwise, the less significant half is passed. Our simulation results show that the 32-bit LOD III can be up to 2.8× more energy-efficient than existing designs in the literature. The Mitchell logarithmic multiplier and a neural network are considered to further illustrate the practicality of the proposed designs.
{"title":"Fast and low-power leading-one detectors for energy-efficient logarithmic computing","authors":"Mohammad Saeed Ansari, Shyama Gandhi, Bruce F. Cockburn, Jie Han","doi":"10.1049/cdt2.12019","DOIUrl":"10.1049/cdt2.12019","url":null,"abstract":"<p>The logarithmic number system (LNS) can be used to simplify the computation of arithmetic functions, such as multiplication. This article proposes three leading-one detectors (LODs) to speed up the binary logarithm calculation in the LNS. The first LOD (LOD I) uses a single fixed value to approximate the <i>d</i> least significant bits (LSBs) in the outputs of the LOD. The second design (LOD II) partitions the <i>d</i> LSBs into smaller fields and uses a multiplexer to select the closest approximation to the exact value. These two LODs help with error cancellation as they introduce signed errors for inputs <i>N</i> < 2<sup><i>d</i></sup>. Additionally, a scaling scheme is proposed that scales up the input <i>N</i> < 2<sup><i>d</i></sup> to avoid large approximation errors. Finally, an improved exact LOD (LOD III) is proposed that only passes half of the input <i>N</i> to the LOD; the more significant half is passed if there is at least one ‘1’ in that half; otherwise, the less significant half is passed. Our simulation results show that the 32-bit LOD III can be up to 2.8× more energy-efficient than existing designs in the literature. The Mitchell logarithmic multiplier and a neural network are considered to further illustrate the practicality of the proposed designs.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 4","pages":"241-250"},"PeriodicalIF":1.2,"publicationDate":"2021-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85907537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pralhadrao V. Shantagiri, Rohit Kapur, Chandrasekar Shastry
The test data volume (TDV) increases with increased target compression in scan compression and adds to the test cost. Increased TDV is the result of a dependency across scan flip-flops (SFFs) that resulted from compression architecture, which is absent in scan mode. The SFFs have uncompressible values logic-0 and logic-1 in many or most of the patterns contribute to the TDV. In the proposed new scan compression (NSC) architecture, SFFs are analysed from Automatic Test Pattern Generation (ATPG) patterns generated in a scan mode. The identification of SFFs to be moved out of the compression architecture is carried out based on the NSC. The method includes a ranking of SFFs based on the specified values present in the test patterns. The SFFs having higher specified values are moved out of the compression architecture and placed in the outside scan chain. The NSC is the combination of scan compression and scan mode. This method decides the percentage (%) of SFFs to be moved out of compression architecture and is less than 0.5% of the total SFFs present in the design to achieve a better result. The NSC reduces dependencies across the SFFs present in the test compression architecture. It reduces the TDV and test application time. The results show a significant reduction in the TDV up to 78.14% for the same test coverage.
{"title":"New scan compression approach to reduce the test data volume","authors":"Pralhadrao V. Shantagiri, Rohit Kapur, Chandrasekar Shastry","doi":"10.1049/cdt2.12020","DOIUrl":"10.1049/cdt2.12020","url":null,"abstract":"<p>The test data volume (TDV) increases with increased target compression in scan compression and adds to the test cost. Increased TDV is the result of a dependency across scan flip-flops (SFFs) that resulted from compression architecture, which is absent in scan mode. The SFFs have uncompressible values logic-0 and logic-1 in many or most of the patterns contribute to the TDV. In the proposed new scan compression (NSC) architecture, SFFs are analysed from Automatic Test Pattern Generation (ATPG) patterns generated in a scan mode. The identification of SFFs to be moved out of the compression architecture is carried out based on the NSC. The method includes a ranking of SFFs based on the specified values present in the test patterns. The SFFs having higher specified values are moved out of the compression architecture and placed in the outside scan chain. The NSC is the combination of scan compression and scan mode. This method decides the percentage (%) of SFFs to be moved out of compression architecture and is less than 0.5% of the total SFFs present in the design to achieve a better result. The NSC reduces dependencies across the SFFs present in the test compression architecture. It reduces the TDV and test application time. The results show a significant reduction in the TDV up to 78.14% for the same test coverage.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 4","pages":"251-262"},"PeriodicalIF":1.2,"publicationDate":"2021-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12020","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88164538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heba E. Hassan, Gihan Nagib, Khaled Hosny Ibrahiem
Multiprocessor task scheduling problem is a pressing problem that affects systems' performance and is still being investigated by the researchers. Finding the optimal schedules is considered to be a computationally hard problem. Recently, researchers have used fuzzy logic in the field of task scheduling to achieve optimal performance, but this area of research is still not well investigated. In addition, there are various scheduling algorithms that used fuzzy logic but most of them are often performed on uniprocessor systems. This article presents a new proposed algorithm in which the priorities of the tasks are derived from the fuzzy logic and bottom level parameter. This approach is designed to find task schedules with optimal or sub-optimal lengths in order to achieve high performance for a multiprocessor environment. With respect to the proposed algorithm, the precedence constraints between the non-preemptive tasks and their execution times are known and described by a directed acyclic graph. The number of processors is fixed, the communication costs are negligible and the processors are homogeneous. The suggested technique is tested and compared with the Prototype Standard Task Graph Set.
{"title":"A novel task scheduling approach for dependent non-preemptive tasks using fuzzy logic","authors":"Heba E. Hassan, Gihan Nagib, Khaled Hosny Ibrahiem","doi":"10.1049/cdt2.12018","DOIUrl":"10.1049/cdt2.12018","url":null,"abstract":"<p>Multiprocessor task scheduling problem is a pressing problem that affects systems' performance and is still being investigated by the researchers. Finding the optimal schedules is considered to be a computationally hard problem. Recently, researchers have used fuzzy logic in the field of task scheduling to achieve optimal performance, but this area of research is still not well investigated. In addition, there are various scheduling algorithms that used fuzzy logic but most of them are often performed on uniprocessor systems. This article presents a new proposed algorithm in which the priorities of the tasks are derived from the fuzzy logic and bottom level parameter. This approach is designed to find task schedules with optimal or sub-optimal lengths in order to achieve high performance for a multiprocessor environment. With respect to the proposed algorithm, the precedence constraints between the non-preemptive tasks and their execution times are known and described by a directed acyclic graph. The number of processors is fixed, the communication costs are negligible and the processors are homogeneous. The suggested technique is tested and compared with the Prototype Standard Task Graph Set.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 3","pages":"214-222"},"PeriodicalIF":1.2,"publicationDate":"2021-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82794869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
<p>The advancements in wireless communication have created exponential growth in the Internet of Things (IoT) systems. Security and privacy of the IoT systems are critical challenges in many data-sensitive applications. Herein, high-throughput and flexible hardware implementations of the Camellia block cipher for IoT applications are presented. In the proposed structures, sub-blocks of the ciphers are implemented based on optimised circuits. The proposed structures for Camellia are designed and shared for implementing the encryption process and generating some intermediate key values in the two separate times. The most complex block in these ciphers is the substitution box (S-box). The S-boxes are implemented based on area-optimised logic circuits. The Camellia S-boxes consist of a field inversion over <math>