{"title":"Memory Optimized Architecture for Efficient Gauss-Jordan Matrix Inversion","authors":"Gon alo","doi":"10.1109/SPL.2007.371720","DOIUrl":null,"url":null,"abstract":"This paper presents a new architecture for efficient Gauss-Jordan matrix inversion algorithm on reconfigurable hardware platforms. The results show that currently available re- configurable computing technology can easily achieve significantly higher floating-point performance than high-end CPUs, running state-of-the-art routines for large matrices operations. For common reconfigurable systems, where the FPGAs are directly coupled to the on-board memory, the achievable performance scales directly with the number of realizable simultaneous memory accesses. A new dedicated reconfigurable architecture is proposed and analysed and the results show a performance improvement of 2x over the previous implementation, using only half of the memory and half of the floating-point units. Benchmarking against Matlab, which features high performance matrix inversion routines, shows that a 100 MHz FPGA can easily surpass the performance of 3,2 GHz Intel Pentium IV processors. This is possible having only 5 double-port memory banks or 9 single-port memory banks connected to the FPGA.","PeriodicalId":419253,"journal":{"name":"2007 3rd Southern Conference on Programmable Logic","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 3rd Southern Conference on Programmable Logic","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPL.2007.371720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
This paper presents a new architecture for efficient Gauss-Jordan matrix inversion algorithm on reconfigurable hardware platforms. The results show that currently available re- configurable computing technology can easily achieve significantly higher floating-point performance than high-end CPUs, running state-of-the-art routines for large matrices operations. For common reconfigurable systems, where the FPGAs are directly coupled to the on-board memory, the achievable performance scales directly with the number of realizable simultaneous memory accesses. A new dedicated reconfigurable architecture is proposed and analysed and the results show a performance improvement of 2x over the previous implementation, using only half of the memory and half of the floating-point units. Benchmarking against Matlab, which features high performance matrix inversion routines, shows that a 100 MHz FPGA can easily surpass the performance of 3,2 GHz Intel Pentium IV processors. This is possible having only 5 double-port memory banks or 9 single-port memory banks connected to the FPGA.