
2009 International Conference on Reconfigurable Computing and FPGAs: Latest Papers

Composite Look-Up Table Gaussian Pseudo-Random Number Generator
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.12
L. Colavito, D. Silage
Simulation of digital communication systems for the evaluation of bit error rate (BER) and other performance characteristics can be accelerated if the processing is implemented in programmable gate array (PGA) hardware. These simulations often require one or more Gaussian distributed pseudorandom number sources. Although uniformly distributed pseudorandom number sources can be readily implemented, Gaussian sources are not as easily configured. A typical method is to build a uniform source and transform the distribution to Gaussian. The inversion method accomplishes the transformation by the application of the inverse Gaussian cumulative distribution function (IGCDF). The IGCDF is easily obtained by the use of a look-up table (LUT). However, the memory required for the LUT can become large if it is to accurately represent the IGCDF. In this paper we demonstrate a method that can reduce the size of this LUT while allowing for control of the accuracy.
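The inversion method described in this abstract can be sketched in a few lines. The following is a minimal software illustration of the plain (single-table) approach only: a uniform table address indexes a LUT that holds the inverse Gaussian CDF. It is not the composite LUT scheme of the paper, whose details the abstract does not give. The function names, the 8-bit address width, and the choice to tabulate bin midpoints are all assumptions made for this example.

```python
import random
from statistics import NormalDist


def build_igcdf_lut(address_bits=8):
    """Tabulate the inverse Gaussian CDF at the midpoint of each of 2**address_bits bins."""
    size = 1 << address_bits
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Midpoints keep every probability strictly inside (0, 1), where inv_cdf is defined.
    return [std_normal.inv_cdf((i + 0.5) / size) for i in range(size)]


def gaussian_sample(lut, rng):
    """Draw a uniform table address and return the tabulated Gaussian value."""
    return lut[rng.randrange(len(lut))]


def sample_stats(lut, count=20000, seed=12345):
    """Return the sample mean and variance of `count` LUT-generated values."""
    rng = random.Random(seed)
    samples = [gaussian_sample(lut, rng) for _ in range(count)]
    mean = sum(samples) / count
    variance = sum((s - mean) ** 2 for s in samples) / count
    return mean, variance
```

Doubling the table size adds one address bit and extends the reachable tails slightly, which illustrates the memory-versus-accuracy trade-off the abstract refers to.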
Citations: 1
Scalability Studies of the BLASTn Scan and Ungapped Extension Functions
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.60
Siddhartha Datta, R. Sass
BLASTn is a ubiquitous and important tool used for large scale DNA analysis. As such, it is a good candidate for acceleration with FPGAs. The aim of this paper is two-fold. First, building upon our prior BLAST work we describe a design composed of multiple cores that can be scaled in two dimensions. The ungapped extension and a second dimension are new in this work. Second, we use this non-trivial example to explore spatially scalable designs. To provide the ability to move the design to a future generation chip, a mathematical model of performance that incorporates all of the system design parameters and the user's preference (high throughput vs low latency) is developed. We demonstrate here that the model correctly predicts the optimal ratio between the two dimensions on a Xilinx Virtex-4, and we measure performance four to five times faster than a state-of-the-art general-purpose processor.
Citations: 5
Communication Performance Characterization for Reconfigurable Accelerator Design on the XD1000
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.32
Tobias Schumacher, Tim Süß, Christian Plessl, M. Platzner
Providing customized memory architectures is key for achieving high-performance with reconfigurable accelerators. Since reconfigurable computers provide limited possibilities for customizing the organization of external memory, a specific challenge is to make use of the existing memory layout in a flexible, yet efficient way. In this paper we build on IMORC, our architectural template and on-chip network for creating reconfigurable accelerators, and discuss its infrastructure for accessing memory. We characterize the IMORC communication bandwidth on the XtremeData XD1000 reconfigurable computer. Based on this characterization, we present a z-buffer compositing accelerator which is able to double the frame-rate of a parallel renderer.
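For readers unfamiliar with the z-buffer compositing step mentioned in this abstract, the sketch below shows the per-pixel operation in software. It is an illustration under stated assumptions, not the accelerator's design: each partial frame is assumed to be a flat list of `(depth, color)` pixels, a smaller depth is assumed to be nearer to the viewer, and ties are resolved in favor of the first frame. The function name `zbuffer_composite` is hypothetical.

```python
def zbuffer_composite(frame_a, frame_b):
    """Merge two partial renderings by keeping, per pixel, the sample nearest the viewer.

    Each frame is a list of (depth, color) tuples; a smaller depth means nearer.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    # On equal depth the pixel from frame_a is kept (an arbitrary tie-break).
    return [a if a[0] <= b[0] else b for a, b in zip(frame_a, frame_b)]


# Two render nodes each drew part of the scene; float("inf") marks empty background.
node_a = [(1.0, "red"), (float("inf"), "background"), (3.0, "red")]
node_b = [(2.0, "blue"), (5.0, "blue"), (3.0, "blue")]
merged = zbuffer_composite(node_a, node_b)
```

Because every pixel is compared independently, the operation is memory-bandwidth bound, which is consistent with the abstract's focus on communication characterization.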
Citations: 2
Base-Calling in DNA Pyrosequencing with Reconfigurable Bayesian Network
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.79
Mingjie Lin, Yaling Ma
A reconfigurable computing method based on a dynamic Bayesian learning network is proposed for base-calling in pyrosequencing from microarray gene expression data. Due to long memory and stochastic non-idealities in the pyrosequencing process, exact inference on the proposed dynamic Bayesian learning network is computationally prohibitive in both run-time and memory usage for reasonable problem sizes. To circumvent these issues, we design a reconfigurable Bayesian learning network, whereby processing nodes evaluate posterior probabilities of all states in parallel and a crossbar switch realizes the network topology that interconnects all processing nodes. The success of the proposed method is demonstrated by a prototype system implemented on a Berkeley Emulation Engine 3 (BEE3) board, which achieves a close to 2 times increase in read length and about 3 orders of magnitude reduction in run-time compared with previously reported results for both experimental and simulated pyrosequencing data.
Citations: 0
Multiprocessor Task Migration Implementation in a Reconfigurable Platform
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.37
L. Gantel, Salah Layouni, M. A. Benkhelifa, F. Verdier, S. Chauvet
Multiprocessor architecture in embedded computing is becoming widely used. In fact, with specific development tools, platforms such as Xilinx Virtex-5 or Virtex-6 FPGAs can implement multiprocessor systems (with soft-core and hard-core processors) with just a few mouse clicks and offer the possibility of partial and dynamic reconfiguration. Software tasks are scheduled on these platforms by an embedded and distributed Real Time Operating System (RTOS). To provide high performance (execution time, power consumption, etc.) to these Multiprocessor SoC (MPSoC) platforms, the RTOS can enable the migration of software tasks between processors. Our work deals with the study and the development of a software layer (an application programming interface) which allows task migration between soft-core processors. The soft-core can be dynamically loaded on the FPGA on demand. In this paper, we present a platform that merges these two aspects, partial reconfiguration and software task migration, in the context of MPSoCs. We notably investigate the incurred time and overhead for task migration and partial reconfiguration.
Citations: 23
Signal Processing Domain Application Mapping on the Brick Reconfigurable Array
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.85
Juan Fernando Eusse Giraldo, R. Jacobi
This paper introduces an Expression Grain Reconfigurable Architecture called BRICK, its functionality and its main components. Mappings for three signal processing applications (a 3x3 2-D convolution, a 16-tap FIR filter and an 8-point FFT) are developed inside the 4x4 Reconfigurable Array. A performance simulation study compares the VHDL implementation of the BRICK reconfigurable array with MIPS and SPARC V8 simulators in order to validate the Reconfigurable Array proposal. Considerable gains of up to an order of magnitude were obtained, and important design issues and challenges were discovered in the course of this work.
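As a point of reference for one of the three benchmark kernels named in this abstract, the sketch below gives a direct-form FIR filter in software, computing y[n] = sum over k of h[k] * x[n-k]. It is a plain reference model, not the BRICK mapping, which the abstract does not describe. The function name, the zero initial state, and the example coefficients are assumptions made for illustration.

```python
def fir_filter(coefficients, samples):
    """Direct-form FIR filter: y[n] = sum_k coefficients[k] * x[n - k].

    Samples before the start of the input are assumed to be zero.
    """
    delay_line = [0.0] * len(coefficients)
    output = []
    for x in samples:
        # Shift the newest sample in at the front and drop the oldest one.
        delay_line = [x] + delay_line[:-1]
        output.append(sum(h * d for h, d in zip(coefficients, delay_line)))
    return output


# A 16-tap moving average (hypothetical coefficients) applied to a constant input.
taps = [1.0 / 16] * 16
response = fir_filter(taps, [1.0] * 16)
```

Each output needs 16 multiplications and 15 additions that are independent of one another, which is the kind of parallelism a reconfigurable array can exploit.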
Citations: 0
FPGA Implementation for Direct Kinematics of a Spherical Robot Manipulator
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.65
Diego F. Sánchez, Daniel M. Muñoz Arboleda, C. Llanos, J. M. Motta
The sequential behavior of general purpose processors presents limitations in applications that require high processing speeds. One of the advantages of FPGA implementations is their parallel processing capability, which allows acceleration of complex algorithms. Nowadays it is common to find FPGA implementations in applications requiring high speed processing. In this paper a hardware architecture for computing direct kinematics of robot manipulators using floating-point arithmetic is presented for 32, 43 and 64 bit-width representations. In addition, the processing time of the hardware architecture is compared with the same formulation implemented in software, using the PowerPC (FPGA embedded processor). The proposed architecture was validated using Matlab results as a statistical estimator in order to compute the Mean Square Error (MSE). Synthesis and simulation results demonstrate the accuracy and high performance of the implemented hardware architecture.
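The direct (forward) kinematics problem mentioned in this abstract maps joint values to the end-effector position. The sketch below shows the idea for a simplified spherical arm and should be read as an assumption-laden illustration: it models two revolute joints (base rotation `theta1`, elevation `theta2`) followed by one prismatic joint (extension `d3`), ignores link offsets, and is not the formulation or parameter set used in the paper. All names are hypothetical.

```python
import math


def spherical_direct_kinematics(theta1, theta2, d3, base_height=0.0):
    """Return the (x, y, z) end-effector position of a simplified spherical arm.

    theta1: rotation about the vertical axis, in radians.
    theta2: elevation above the horizontal plane, in radians.
    d3: extension of the prismatic joint.
    base_height: height of the shoulder above the origin.
    """
    horizontal_reach = d3 * math.cos(theta2)
    x = horizontal_reach * math.cos(theta1)
    y = horizontal_reach * math.sin(theta1)
    z = base_height + d3 * math.sin(theta2)
    return x, y, z
```

The computation is a handful of sine, cosine, and multiply operations with no data-dependent branching, which is why it lends itself to a pipelined floating-point datapath.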
Citations: 6
A Reconfigurable Design Framework for FPGA Adaptive Computing
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.39
Ming Liu, Zhonghai Lu, W. Kuehn, Shuo Yang, A. Jantsch
Partial Reconfiguration (PR) offers the possibility to adaptively change part of the FPGA design without stopping the remaining system. In this paper, we present a comprehensive framework for adaptive computing, in which the key design points of hardware processes, system interconnections, Operating Systems (OS), device drivers, scheduler software and context switching are addressed in different hardware/software layers. A case study demonstrates swapping a Flash memory controller and an SRAM controller in response to diverse memory access needs. Result analysis reveals a more efficient resource utilization of 52.1% I/O pads, 86.5% LUTs and 81.3% Flip-Flops, when compared to the static design with the same functionalities. A small reconfiguration overhead of context switching is measured, ranging from hundreds of microseconds to milliseconds. Moreover, technical perspectives are analyzed, and great benefits are foreseen from the proposed design framework in target applications of particle physics experiments.
Citations: 14
Hotspot Mitigation Using Dynamic Partial Reconfiguration for Improved Performance
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.80
Adwait Gupte, Phillip H. Jones
As chips get denser and faster, heat dissipation is fast turning into a major problem in the development of ICs. Nonuniform heating of chips due to hotspots is also an area of concern and much research. In this paper, we propose an adaptive method which takes advantage of the self-reconfiguration capability of modern FPGAs to mitigate hotspots. We adapt the floor plan of the IC on the fly in response to the current use and ambient conditions. It is most applicable to paradigms such as Network on Chip (NoC) that allow separation of communication and computation and allow communication between modules to be abstracted away. We achieve a reduction of up to 8 °C in the maximum temperature of a hotspot using typical power numbers. Alternatively, by increasing the frequency, we achieve a 2-3 times increase in throughput while maintaining the same maximum temperature.
Citations: 21
Low Power, Reconfigurable Computing Platform for Spacecraft
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.71
Guillermo Conde, G. Donohoe, S. Maheswaran
This paper describes a project undertaken to explore reconfigurable computing as a means to achieve high-throughput, low-power on-board computing for spacecraft. The solution consists of a reconfigurable data processor chip, a reconfigurable memory module, reconfigurable interconnect, and dynamic power management. The reconfigurable processor chip was fabricated in a 0.25µ bulk CMOS process using a radiation-hard-by-design standard cell library. Two challenge algorithms were demonstrated in hardware, and a dozen others in software simulation. It was shown to achieve up to 3 giga-operations/second-watt. This architecture is well-suited to future generations of ultra-low-power, low-voltage processors and memories, as the extensibility offsets the loss in throughput due to low-voltage
Citations: 0