Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.12
L. Colavito, D. Silage
Simulation of digital communication systems for the evaluation of bit error rate (BER) and other performance characteristics can be accelerated if the processing is implemented in programmable gate array (PGA) hardware. These simulations often require one or more Gaussian distributed pseudorandom number sources. Although uniformly distributed pseudorandom number sources can be readily implemented, Gaussian sources are not as easily configured. A typical method is to build a uniform source and transform the distribution to Gaussian. The inversion method accomplishes the transformation by the application of the inverse Gaussian cumulative distribution function (IGCDF). The IGCDF is easily obtained by the use of a look-up table (LUT). However, the memory required for the LUT can become large if it is to accurately represent the IGCDF. In this paper we demonstrate a method that can reduce the size of this LUT while allowing for control of the accuracy.
Title: Composite Look-Up Table Gaussian Pseudo-Random Number Generator. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
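The inversion method described in the abstract above can be illustrated with a minimal Python sketch. It tabulates the IGCDF in a single flat LUT and indexes that table with a uniform integer. This is the plain baseline whose memory cost the paper sets out to reduce, not the composite table the paper proposes, since the abstract does not describe its construction. The names `build_igcdf_lut` and `ADDRESS_BITS`, the bin-midpoint sampling, and the use of Python's `random` module as the uniform source are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

ADDRESS_BITS = 10  # hypothetical LUT address width; the table holds 2**ADDRESS_BITS entries


def build_igcdf_lut(address_bits):
    """Tabulate the inverse Gaussian CDF at the midpoints of equal-width bins of (0, 1).

    Midpoints are used so that the table never evaluates the IGCDF at 0 or 1,
    where it is unbounded.
    """
    size = 1 << address_bits
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i + 0.5) / size) for i in range(size)]


lut = build_igcdf_lut(ADDRESS_BITS)

# A uniform pseudorandom integer selects a table entry; the entry is the Gaussian sample.
rng = random.Random(1)  # stands in for a hardware uniform source
samples = [lut[rng.getrandbits(ADDRESS_BITS)] for _ in range(100_000)]
```

Each extra address bit doubles the table size, and the largest magnitude the table can produce is fixed by its outermost entries, which is why a flat LUT becomes expensive when the tails of the distribution must be represented accurately.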
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.60
Siddhartha Datta, R. Sass
BLASTn is a ubiquitous and important tool for large-scale DNA analysis. As such, it is a good candidate for acceleration with FPGAs. The aim of this paper is twofold. First, building upon our prior BLAST work, we describe a design composed of multiple cores that can be scaled in two dimensions. The ungapped extension and a second dimension are new in this work. Second, we use this non-trivial example to explore spatially scalable designs. To make it possible to move the design to a future-generation chip, we develop a mathematical model of performance that incorporates all of the system design parameters and the user’s preference (high throughput vs. low latency). We demonstrate that the model correctly predicts the optimal ratio between the two dimensions on a Xilinx Virtex-4, and we measure performance four to five times faster than a state-of-the-art general-purpose processor.
Title: Scalability Studies of the BLASTn Scan and Ungapped Extension Functions. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.32
Tobias Schumacher, Tim Süß, Christian Plessl, M. Platzner
Providing customized memory architectures is key to achieving high performance with reconfigurable accelerators. Since reconfigurable computers provide limited possibilities for customizing the organization of external memory, a specific challenge is to make use of the existing memory layout in a flexible yet efficient way. In this paper, we build on IMORC, our architectural template and on-chip network for creating reconfigurable accelerators, and discuss its infrastructure for accessing memory. We characterize the IMORC communication bandwidth on the XtremeData XD1000 reconfigurable computer. Based on this characterization, we present a z-buffer compositing accelerator that is able to double the frame rate of a parallel renderer.
Title: Communication Performance Characterization for Reconfigurable Accelerator Design on the XD1000. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
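For readers unfamiliar with the operation that the accelerator in the abstract above speeds up, z-buffer compositing merges two partial renderings of the same view by keeping, for every pixel, the fragment closer to the camera. The Python sketch below shows that per-pixel rule only. It says nothing about how the IMORC-based accelerator is organized, and the function name `composite`, the flat-list image layout, and the convention that a smaller depth value is nearer are assumptions made for illustration.

```python
def composite(color_a, depth_a, color_b, depth_b):
    """Merge two partial images pixel by pixel, keeping the nearer fragment.

    Each image is a flat list of pixel colors with a matching list of depths.
    Assumption: a smaller depth value means closer to the camera; ties keep image A.
    """
    out_color, out_depth = [], []
    for ca, da, cb, db in zip(color_a, depth_a, color_b, depth_b):
        if da <= db:
            out_color.append(ca)
            out_depth.append(da)
        else:
            out_color.append(cb)
            out_depth.append(db)
    return out_color, out_depth


# Two hypothetical three-pixel partial images from different render nodes.
colors, depths = composite([1, 2, 3], [0.5, 0.9, 0.2],
                           [7, 8, 9], [0.6, 0.1, 0.2])
```

Every pixel needs two color reads, two depth reads and one write of each, with very little arithmetic in between, so the operation is dominated by memory traffic. That is consistent with the abstract's emphasis on characterizing communication bandwidth first.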
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.79
Mingjie Lin, Yaling Ma
A reconfigurable computing method based on a dynamic Bayesian learning network is proposed for base-calling in pyrosequencing from microarray gene expression data. Due to long memory and stochastic non-idealities in the pyrosequencing process, exact inference on the proposed dynamic Bayesian learning network is computationally prohibitive in both run-time and memory usage for reasonable problem sizes. To circumvent these issues, we design a reconfigurable Bayesian learning network in which processing nodes evaluate the posterior probabilities of all states in parallel and a crossbar switch realizes the network topology that interconnects all processing nodes. The success of the proposed method is demonstrated by a prototype system implemented on a Berkeley Emulation Engine 3 (BEE3) board, which achieves a close to 2 times increase in read length and a reduction in run-time of about 3 orders of magnitude compared with previously reported results, for both experimental and simulated pyrosequencing data.
Title: Base-Calling in DNA Pyrosequencing with Reconfigurable Bayesian Network. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.37
L. Gantel, Salah Layouni, M. A. Benkhelifa, F. Verdier, S. Chauvet
Multiprocessor architectures are becoming widely used in embedded computing. With specific development tools, platforms such as Xilinx Virtex-5 or Virtex-6 FPGAs can implement multiprocessor systems (with soft-core and hard-core processors) with just a few mouse clicks and offer the possibility of partial and dynamic reconfiguration. Software tasks are scheduled on these platforms by an embedded, distributed Real-Time Operating System (RTOS). To improve the performance (execution time, power consumption, etc.) of these Multiprocessor SoC (MPSoC) platforms, the RTOS can enable the migration of software tasks between processors. Our work deals with the study and development of a software layer (an application programming interface) that allows task migration between soft-core processors. A soft-core can be dynamically loaded onto the FPGA on demand. In this paper, we present a platform that merges these two aspects, partial reconfiguration and software task migration, in the context of MPSoCs. We notably investigate the time and overhead incurred by task migration and partial reconfiguration.
Title: Multiprocessor Task Migration Implementation in a Reconfigurable Platform. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.85
Juan Fernando Eusse Giraldo, R. Jacobi
This paper introduces BRICK, a proposed Expression Grain Reconfigurable Architecture, and describes its functionality and main components. Mappings of three signal processing applications (a 3x3 2-D convolution, a 16-tap FIR filter and an 8-point FFT) onto the 4x4 Reconfigurable Array are developed. A performance simulation study compares the VHDL implementation of the BRICK reconfigurable array with MIPS and SPARC V8 simulators in order to validate the Reconfigurable Array proposal. Gains of up to an order of magnitude are obtained, and important design issues and challenges were identified in the course of this work.
Title: Signal Processing Domain Application Mapping on the Brick Reconfigurable Array. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
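As a reminder of what one of the three mapped kernels computes, a 16-tap FIR filter evaluates y[n] = sum over k of h[k] * x[n-k] for k = 0..15. The Python sketch below is a plain software reference for that sum. It does not reflect how the kernel is mapped onto the BRICK array, and the function name `fir_filter` and the moving-average coefficients are hypothetical, chosen only so the example is easy to check.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:  # samples before the start of the input are treated as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


taps = [1.0 / 16] * 16            # hypothetical coefficients: a 16-tap moving average
impulse = [1.0] + [0.0] * 19      # unit impulse followed by zeros
response = fir_filter(impulse, taps)
```

Feeding a unit impulse through the filter returns the coefficients themselves, which is a convenient first check for any implementation. The 16 multiply-accumulate operations per output sample are independent of one another, which is the kind of parallelism a reconfigurable array can exploit.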
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.65
Diego F. Sánchez, Daniel M. Muñoz Arboleda, C. Llanos, J. M. Motta
The sequential behavior of general-purpose processors limits them in applications that require high processing speeds. One advantage of FPGA implementations is their parallel processing capability, which allows complex algorithms to be accelerated. Nowadays it is common to find FPGA implementations in applications requiring high-speed processing. In this paper, a hardware architecture for computing the direct kinematics of robot manipulators using floating-point arithmetic is presented for 32, 43 and 64 bit-width representations. In addition, the processing time of the hardware architecture is compared with that of the same formulation implemented in software on the PowerPC (FPGA embedded processor). The proposed architecture was validated using Matlab results as a statistical estimator in order to compute the Mean Square Error (MSE). Synthesis and simulation results demonstrate the accuracy and high performance of the implemented hardware architecture.
Title: FPGA Implementation for Direct Kinematics of a Spherical Robot Manipulator. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.39
Ming Liu, Zhonghai Lu, W. Kuehn, Shuo Yang, A. Jantsch
Partial Reconfiguration (PR) makes it possible to adaptively change part of an FPGA design without stopping the rest of the system. In this paper, we present a comprehensive framework for adaptive computing, in which the key design points of hardware processes, system interconnections, Operating Systems (OS), device drivers, scheduler software and context switching are each addressed in different hardware/software layers. A case study demonstrates swapping a Flash memory controller and an SRAM controller in response to diverse memory access needs. Result analysis reveals a more efficient resource utilization of 52.1% I/O pads, 86.5% LUTs and 81.3% Flip-Flops when compared to the static design with the same functionality. A small context-switching reconfiguration overhead is measured, in the range from hundreds of microseconds to milliseconds. Moreover, technical perspectives are analyzed, and the proposed design framework is expected to bring great benefits to the target applications in particle physics experiments.
Title: A Reconfigurable Design Framework for FPGA Adaptive Computing. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.80
Adwait Gupte, Phillip H. Jones
As chips get denser and faster, heat dissipation is fast becoming a major problem in the development of ICs. Nonuniform heating of chips due to hotspots is also an area of concern and of much research. In this paper, we propose an adaptive method that takes advantage of the self-reconfiguration capability of modern FPGAs to mitigate hotspots. We adapt the floor plan of the IC on the fly in response to the current use and ambient conditions. The method is most applicable to paradigms such as Network on Chip (NoC) that separate communication from computation and allow communication between modules to be abstracted away. We achieve a reduction of up to 8 °C in the maximum temperature of a hotspot using typical power numbers. Alternatively, by increasing the frequency, we achieve a 2-3 times increase in throughput while maintaining the same maximum temperature.
Title: Hotspot Mitigation Using Dynamic Partial Reconfiguration for Improved Performance. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.71
Guillermo Conde, G. Donohoe, S. Maheswaran
This paper describes a project undertaken to explore reconfigurable computing as a means to achieve high-throughput, low-power on-board computing for spacecraft. The solution consists of a reconfigurable data processor chip, a reconfigurable memory module, reconfigurable interconnect, and dynamic power management. The reconfigurable processor chip was fabricated in a 0.25 µm bulk CMOS process using a radiation-hard-by-design standard cell library. Two challenge algorithms were demonstrated in hardware, and a dozen others in software simulation. It was shown to achieve up to 3 giga-operations/second-watt. This architecture is well-suited to future generations of ultra-low-power, low-voltage processors and memories, as the extensibility offsets the loss in throughput due to low-voltage
Title: Low Power, Reconfigurable Computing Platform for Spacecraft. Published in: 2009 International Conference on Reconfigurable Computing and FPGAs.