The diversity study of AES on FPGA application
Published in: 2002 IEEE International Conference on Field-Programmable Technology, 2002. (FPT). Proceedings.
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188718
M. Jing, C. Hsu, T. Truong, Yan-Haw Chen, Y. Chang
In applications of AES, long-term robustness and reliability during operation deserve serious consideration. Such considerations motivate a design-for-diversity requirement to guard against break-ins from outside. In system design, a reconfigurable FPGA provides a higher level of flexibility. The system proposed in this paper uses different generators and a variety of transforms, modules and algorithms to enhance the randomization of the ciphertext. Improving system flexibility while obtaining a more secure AES design is also a challenge. Several reconfigurable modules are developed on our integrated test-bench.
Energy efficiency of FPGAs and programmable processors for matrix multiplication
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188725
R. Scrofano, S. Choi, V. Prasanna
Advances in their respective technologies have positioned FPGAs and embedded processors to compete with digital signal processors (DSPs). In this paper, we evaluate the performance, in terms of both latency and energy efficiency, of FPGAs, embedded processors, and DSPs in multiplying two n × n matrices. As specific examples, we have chosen a representative of each type of device. Our results show that the FPGAs can multiply two n × n matrices with both lower latency and lower energy consumption than the other two types of devices. This makes FPGAs the ideal choice for matrix multiplication in signal processing applications.
Adaptive FIR filter architectures for run-time reconfigurable FPGAs
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188664
T. Rissa, R. Uusikartano, J. Niittylahti
This paper presents a technique for realizing adaptive FIR filters that use constant-coefficient multipliers on a run-time reconfigurable FPGA. Three different adaptive FIR filter architectures for run-time reconfigurable FPGAs are presented. It is shown that run-time reconfigurable logic can be used to efficiently implement adaptive constant-coefficient FIR filters. With reasonable configuration latency, benefits in speed, area and power consumption are obtained.
Serial-parallel tradeoff analysis of all-pairs shortest path algorithms in reconfigurable computing
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188697
T. Mak, K. Lam
Implementing shortest path algorithms on FPGAs has recently been proposed for solving the network routing problem. This paper discusses the architecture and implementation of two shortest path approaches: the Floyd-Warshall algorithm and a parallel implementation of the Bellman-Ford algorithm in the Binary Relation Inference Network architecture. The two approaches differ significantly in their performance when computing shortest paths. Computation speed and resource consumption are discussed. An alternative, serial implementation of the synchronized inference network for the single-destination problem is also explored, with emphasis on computation time, resource consumption, and scaling with problem size.
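As a software point of reference for the two algorithms named in the abstract above, the following minimal Python sketch computes all-pairs distances with Floyd-Warshall and single-destination distances with a synchronous Bellman-Ford update. It is not the Binary Relation Inference Network hardware described in the paper; the function names, the dense weight-matrix input, and the example graph `W` are assumptions made for illustration.

```python
import math

INF = math.inf  # marks a missing edge


def floyd_warshall(w):
    """All-pairs shortest path lengths from an n x n weight matrix."""
    n = len(w)
    d = [row[:] for row in w]  # work on a copy of the input
    for k in range(n):          # allow node k as an intermediate hop
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def bellman_ford_to(w, dest):
    """Shortest path length from every node to `dest` (single destination).

    Every node is updated from the previous iteration's values, which is
    the form that lends itself to one-unit-per-node parallel evaluation.
    Assumes no negative cycles.
    """
    n = len(w)
    dist = [INF] * n
    dist[dest] = 0
    for _ in range(n - 1):
        new = [min([dist[i]] + [w[i][j] + dist[j] for j in range(n)])
               for i in range(n)]
        if new == dist:  # converged early
            break
        dist = new
    return dist


# Hypothetical 4-node directed graph (illustrative only).
W = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]

if __name__ == "__main__":
    print(floyd_warshall(W))
    print(bellman_ford_to(W, 0))
```

The single-destination result should match the corresponding column of the all-pairs matrix, which gives a quick consistency check between the serial and the iteration-parallel formulation.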
The next big leap in reconfigurable systems
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188660
Paul L. Master
The age of adaptive computing is upon us, empowering the engineering community with the next big leap in computing: one in which algorithmic elements are mapped directly onto dynamic hardware resources to create the exact hardware needed for a task, clock cycle by clock cycle. The outcome of this concept is a computing platform that combines the best of hardware and software into a powerful enabling technology for design and innovation. The Adaptive Computing Machine (ACM) is the first instantiation of adaptive computing. The ACM offers ASIC-class performance and low power consumption by means of a highly flexible architecture that is dynamically configured, both spatially and temporally, so that software becomes the needed hardware. The inherent adaptability and high performance of the ACM enable next-generation mobile and wireless devices to become personal communicators with multifunctionality that includes advanced features, such as streaming media and digital imaging, as well as software-defined radio (SDR) for world phone capabilities.
Speedup analysis in simulation-emulation co-operation
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188719
S. Miremadi, Siavash Bayat Sarmadi, H. Asadi
This paper presents an analytical approach to estimating the speedup in a simulation-emulation co-operation environment. The speedup of this approach is analyzed relative to that of pure simulation. The speedup is also analyzed for different types of application instructions. The analysis is based on using both Verilog and VHDL. The results show that when only the simulation part of the simulation-emulation co-operation is used, the speedup is higher than when pure simulation is used. The total speedup also depends on the type of application instructions and on the communication cycle time between the simulator and the emulator.
A multiplier-less FPGA core for image algebra neighbourhood operations
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188695
K. Benkrid
This paper presents the design and implementation of a high-level generator of optimised FPGA configurations for Image Algebra (IA) neighbourhood operations. These configurations are parameterised and scalable in terms of the IA operation itself, the window size, the window coefficients, the input pixel word length and the image size. The window coefficients of the neighbourhood operations are represented as sums and differences of powers of two in Canonical Signed Digit (CSD) representation, so the usually costly multiplication can be implemented with a small number of simple shift-and-add operations, leading to considerable hardware savings. EDIF netlists are generated automatically from high-level descriptions of the IA operations in about 1 second. These are specifically optimised for Xilinx XC4000 chips, although implementations for other targets can also be easily realised.
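To make the CSD idea in the abstract above concrete, here is a minimal Python sketch that recodes an integer coefficient into signed digits in {-1, 0, +1} with no two adjacent non-zero digits, then multiplies by that coefficient using only shifts, additions and subtractions. The function names `to_csd` and `csd_multiply` are hypothetical and the code is a software illustration, not the paper's netlist generator.

```python
def to_csd(c):
    """Return the CSD digits of integer c, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are non-zero.
    """
    digits = []
    while c != 0:
        if c & 1:
            # +1 when c mod 4 == 1, -1 when c mod 4 == 3; subtracting the
            # digit forces the next digit to be zero.
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def csd_multiply(x, c):
    """Compute x * c with shifts and adds/subtracts only."""
    acc = 0
    for shift, d in enumerate(to_csd(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


if __name__ == "__main__":
    # 7 = 8 - 1, so one subtraction replaces the two additions of 4 + 2 + 1.
    print(to_csd(7))
    print(csd_multiply(13, 7))
```

In hardware terms, each non-zero digit costs one adder or subtractor, so fewer non-zero digits per coefficient means less logic per window tap.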
The feasibility study of designing a FPGA multiplier-core on finite field
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188717
C. Hsu, T. Truong, M. Jing, W. Wu, H. C. Wu
In digital system development, a CPLD/FPGA is usually used to implement basic function blocks for the purposes of testing, integration and IP proof. The advantages of a CPLD/FPGA are high efficiency, flexibility and easy reconfiguration. Taking AES as an example, this application needs more flexible transformations in order to design for diversity. To meet such requirements without degrading performance, a modified FPGA architecture is proposed that increases overall efficiency and maintains high throughput. A finite field multiplier is used to illustrate the newly developed core. The parallel and pipelined FPGA design can replace a high-speed VLSI chip while offering dynamic reconfigurability.
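The abstract above does not state which field or reduction polynomial the multiplier core targets. As a hedged illustration only, the Python sketch below multiplies two bytes in GF(2^8) using the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) with a bit-serial shift-and-XOR loop. The function name `gf256_mul` and the choice of field are assumptions, and the loop is a software model rather than the parallel, pipelined design the paper proposes.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES field polynomial


def gf256_mul(a, b, poly=AES_POLY):
    """Multiply two elements of GF(2^8), given as integers in 0..255.

    Addition in the field is XOR; multiplying by x is a left shift
    followed by a conditional reduction with `poly`.
    """
    result = 0
    for _ in range(8):
        if b & 1:          # current bit of b selects this multiple of a
            result ^= a
        a <<= 1            # a := a * x
        if a & 0x100:      # degree reached 8: reduce modulo poly
            a ^= poly
        b >>= 1
    return result


if __name__ == "__main__":
    # Worked example from the AES specification: {57} * {83} = {c1}.
    print(hex(gf256_mul(0x57, 0x83)))
```

Because the eight loop iterations are independent apart from the running XOR, a hardware version can unroll them into combinational logic or pipeline stages, which is the kind of tradeoff a reconfigurable core can expose.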
Floating-point bitwidth analysis via automatic differentiation
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188677
A. A. Gaffar, O. Mencer, W. Luk, P. Cheung, N. Shirazi
Automatic bitwidth analysis is a key ingredient for high-level programming of FPGAs and high-level synthesis of VLSI circuits. The objective is to find the minimal number of bits needed to represent a value, in order to minimise circuit area and improve the efficiency of the respective arithmetic operations while satisfying user-defined numerical constraints. We present a novel approach to bitwidth (or precision) analysis for floating-point designs. The approach involves analysing the dataflow graph representation of a design to see how sensitive the output of a node is to changes in the outputs of other nodes: higher sensitivity requires higher precision and hence more output bits. We automate such sensitivity analysis with a mathematical method called automatic differentiation, which involves differentiating variables in a design with respect to other variables. We illustrate our approach by optimising the bitwidth for two examples, a discrete Fourier transform (DFT) implementation and a Finite Impulse Response (FIR) filter implementation.
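The sensitivity step described in the abstract above can be sketched with forward-mode automatic differentiation on dual numbers. The Python below is a minimal illustration under stated assumptions: the `Dual` class, the `sensitivities` and `output_error_bound` helpers, and the 3-tap filter `fir3` are all hypothetical, and the first-order error bound is a generic textbook estimate, not the paper's bitwidth-selection procedure.

```python
class Dual:
    """A value together with its derivative with respect to one chosen input."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def sensitivities(f, inputs):
    """Partial derivative of f with respect to each input, one pass per input."""
    grads = []
    for i in range(len(inputs)):
        args = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(inputs)]
        grads.append(f(*args).der)
    return grads


def output_error_bound(sens, input_errors):
    """First-order bound on output error: sum of |dy/dx_i| * eps_i."""
    return sum(abs(s) * e for s, e in zip(sens, input_errors))


def fir3(x0, x1, x2):
    """Hypothetical 3-tap FIR filter with fixed coefficients."""
    return 0.25 * x0 + 0.5 * x1 + 0.25 * x2


if __name__ == "__main__":
    s = sensitivities(fir3, [1.0, 2.0, 3.0])
    print(s)
    print(output_error_bound(s, [2.0 ** -8] * 3))
```

An input with a larger partial derivative contributes more to the output error for the same rounding error, which is the intuition behind giving high-sensitivity nodes more bits.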
Technology research and development in Hong Kong: hype or reality
Pub Date : 2002-12-16 DOI: 10.1109/FPT.2002.1188656
P. Cheung
Looks at the development of innovation and technology funding, infrastructure and achievements in Hong Kong over the past few years, and examines the potential areas where Hong Kong can excel and be a significant contributor to technology development. Comparisons are drawn with Finland, an economy of a size and GDP very similar to those of Hong Kong.