
2008 International Conference on Field Programmable Logic and Applications: Latest Papers

On the design parameters of runtime reconfigurable systems
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630039
Thilo Pionteck, C. Albrecht, R. Koch, E. Maehle
This paper explores the design space for runtime reconfigurable systems. A broad range of systems is surveyed, and a set of parameters for characterizing runtime reconfigurable systems is proposed. Unlike other surveys, the focus is on the system architecture rather than on the underlying hardware structure. This allows the discussion to concentrate on the actual motivation for using runtime reconfiguration in system designs instead of on the limitations of actual hardware platforms.
Citations: 1
An FPGA-based high-speed, low-latency trigger processor for high-energy physics
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629947
J. Cuveland, F. Rettig, V. Angelov, V. Lindenstruth
An example of an FPGA-based application for a high-energy physics experiment is presented which features all facets of modern FPGA design. The special requirements are high bandwidth (2.16 Tbit/s), low latency, and flexibility in the processing algorithm. The input data arrive optically via 1080 links operating at 2.5 Gbit/s. The whole system is partitioned hierarchically into 18 groups of 5+1 modules and one top module. All modules contain the same PCB, FPGA, DDR SRAM and SDRAM but are equipped with different optional components and additional interface boards, which simplifies hardware development significantly and reduces production costs. Embedded PowerPC processors running Linux are used to implement a control and monitoring system. The system was installed in its real environment in December 2007 and has been in continuous operation for cosmic data taking.
Citations: 3
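A note on the figures in the trigger processor abstract: 1080 links at 2.5 Gbit/s give a raw line rate of 2.7 Tbit/s, not the quoted 2.16 Tbit/s. One reading that reconciles the two numbers is 8b/10b line coding, where 8 of every 10 transmitted bits are payload. The abstract does not state the coding scheme, so that factor is an assumption here. The short sketch below works through this arithmetic and the module count implied by "18 groups of 5+1 modules and one top module"; all function names are illustrative.

```python
# Figures quoted in the abstract.
NUM_LINKS = 1080
LINE_RATE_MBIT_S = 2500          # 2.5 Gbit/s per optical link
GROUPS = 18
MODULES_PER_GROUP = 5 + 1
TOP_MODULES = 1

# Assumption, not stated in the abstract: 8b/10b line coding,
# so only 8 of every 10 transmitted bits carry payload.
PAYLOAD_BITS, LINE_BITS = 8, 10


def raw_rate_mbit_s(links: int, line_rate: int) -> int:
    """Aggregate line rate over all links, in Mbit/s."""
    return links * line_rate


def payload_rate_mbit_s(links: int, line_rate: int) -> int:
    """Aggregate payload rate under the assumed 8b/10b coding, in Mbit/s."""
    return raw_rate_mbit_s(links, line_rate) * PAYLOAD_BITS // LINE_BITS


def module_count(groups: int, per_group: int, top: int) -> int:
    """Total number of modules in the hierarchy."""
    return groups * per_group + top


if __name__ == "__main__":
    raw = raw_rate_mbit_s(NUM_LINKS, LINE_RATE_MBIT_S)
    payload = payload_rate_mbit_s(NUM_LINKS, LINE_RATE_MBIT_S)
    print(f"raw:     {raw / 1e6} Tbit/s")
    print(f"payload: {payload / 1e6} Tbit/s (assuming 8b/10b)")
    print(f"modules: {module_count(GROUPS, MODULES_PER_GROUP, TOP_MODULES)}")
```

Integer Mbit/s values are used so that the comparison with the quoted 2.16 Tbit/s is exact rather than subject to floating-point rounding.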
A link removal methodology for Networks-on-Chip on reconfigurable systems
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629943
Daihan Wang, Hiroki Matsutani, H. Amano, M. Koibuchi
While the regular 2-D mesh topology has been used for most networks-on-chip (NoCs) on FPGAs, spatially biased traffic in some applications makes customization feasible. A link removal strategy that customizes the routers in a NoC is proposed for reconfigurable systems in order to minimize the amount of hardware required. Based on pre-analyzed traffic information, links that carry little communication are removed to reduce hardware cost while maintaining sufficient performance. Two policies are proposed to avoid deadlocks, and they achieve better performance than up*/down* routing on the irregular topology that results from link removal. In the image recognition application susan, the proposed method saves 30% of the hardware amount without performance degradation.
Citations: 12
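The link removal idea in the abstract above can be illustrated with a small greedy sketch: start from a 2-D mesh, visit links in order of increasing pre-analyzed traffic, and drop a link only if its traffic is at or below a threshold and the network stays connected. This is not the paper's method; the threshold, the connectivity check, and all function names are assumptions made for illustration, and the deadlock-avoidance policies the abstract mentions are not modelled.

```python
from collections import deque


def mesh_links(width, height):
    """Undirected links of a width x height 2-D mesh, as pairs of (x, y) nodes."""
    links = set()
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                links.add(((x, y), (x + 1, y)))
            if y + 1 < height:
                links.add(((x, y), (x, y + 1)))
    return links


def is_connected(nodes, links):
    """Breadth-first search: True if every node is reachable from the first one."""
    adjacency = {n: [] for n in nodes}
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in adjacency[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(nodes)


def remove_light_links(nodes, links, traffic, threshold):
    """Greedily drop links whose traffic is <= threshold while staying connected."""
    kept = set(links)
    # Visit the least-used links first; the link itself breaks ties deterministically.
    for link in sorted(links, key=lambda l: (traffic.get(l, 0), l)):
        if traffic.get(link, 0) > threshold:
            break  # every remaining link carries more traffic than the threshold
        candidate = kept - {link}
        if is_connected(nodes, candidate):
            kept = candidate
    return kept
```

For a 2 x 2 mesh with two idle links, only one of them can be removed, because removing both would isolate a router.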
High-performance FPGA-based floating-point adder with three inputs
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630025
A. Guntoro, M. Glesner
In this paper, we present the design and implementation of an FPGA-based floating-point adder with three inputs. The design uses a 5-stage pipeline in order to distribute the critical paths and maximize performance. We examine the data dependencies to minimize the number of pipeline stages and to reduce resource usage. Our design is parameterisable in order to cope with different floating-point formats, including the standard IEEE 754 formats and custom configurations. The proposed design, in the single-precision 32-bit floating-point format, can be operated at 143 MHz on a Xilinx Virtex2Pro XC2VP30-7.
Citations: 7
SPP1148 booth: Fine grain reconfigurable architectures
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629956
Josef Angermeier, Mateusz Majer, Jürgen Teich, L. Braun, T. Schwalb, P. Graf, M. Hubner, Jürgen Becker, Enno Lübbers, M. Platzner, C. Claus, W. Stechele, A. Herkersdorf, M. Rullmann, R. Merker
In this booth on fine grain reconfigurable architectures, several research groups demonstrate their joint work on operating concepts for managing dynamic and partial reconfiguration, on visualization of bitstreams and routing, on an application that applies dynamic reconfiguration to video engines, and on minimization of reconfiguration data. Uniquely, all four projects present their work on the same reconfigurable FPGA-based fabric, the Erlangen slot machine, which was itself built within one project solely for the purpose of experimenting with dynamic fine grain reconfiguration as an interdisciplinary platform.
Citations: 4
Generation of partial FPGA configurations at run-time
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629965
M. Silva, J. Ferreira
The paper presents a method for generating partial bitstreams on-line for use in systems with run-time reconfigurable FPGAs. Bitstream creation is performed at run-time by merging partial bitstreams from individual component modules. The process includes the capability to create connections between the modules by selection from a set of routes found during an off-line pre-processing step. Placement and interconnection of modules must follow a precise set of rules. While restricting the number of possible module arrangements, this approach allows bitstream creation to be performed with relatively few computational resources. Using a demonstration system with a Virtex-II Pro FPGA with a PowerPC 405 CPU, the process of creating at run-time a partial bitstream for 22% of the device area takes 24 ms.
Citations: 19
Convergence analysis of run-time distributed optimization on adaptive systems using game theory
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630007
D. Puschini, F. Clermidy, P. Benoit, G. Sassatelli, L. Torres
We consider multiprocessor systems-on-chip (MP-SoCs) integrating several processing elements (PEs). These architectures require distributed and scalable control techniques for run-time optimization of application parameters. Our approach is to use game theory as an optimization model to resolve trade-offs at run-time. We applied it to distributed dynamic voltage and frequency scaling (DVFS) management, adjusting at run-time the frequency of each PE based on the synchronization between tasks of the application graph and the PE temperature profile. Results show that the analyzed algorithm converges to a solution in about 94% of the cases and in fewer than 40 calculation cycles for a 100-processor MP-SoC. It reaches an average optimization of 89% compared to an off-line centralized reference but runs about 140 times faster in simulation.
Citations: 4
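To make the game-theoretic DVFS idea above more concrete, the sketch below runs a sequential best-response loop: each PE in turn picks the frequency level that minimizes its own cost given the current choices of its neighbours, and the loop stops when no PE wants to change. The cost function is a hypothetical stand-in (squared distance to a per-PE target frequency plus squared mismatch with neighbours), chosen so that this toy game always settles. It is not the model used in the paper, whose algorithm is reported to converge in about 94% of cases. The 40-round cap only echoes the figure in the abstract.

```python
def best_response(i, freqs, targets, neighbors, levels):
    """Frequency level that minimizes PE i's cost, given the other PEs' choices."""
    def cost(f):
        # Hypothetical cost: deviation from a target frequency (standing in for
        # the temperature profile) plus mismatch with neighbouring PEs
        # (standing in for task synchronization).
        return (f - targets[i]) ** 2 + sum((f - freqs[j]) ** 2 for j in neighbors[i])

    best = min(levels, key=cost)
    # Keep the current choice on ties so the loop cannot oscillate
    # between two equally good levels.
    return freqs[i] if cost(freqs[i]) <= cost(best) else best


def run_game(freqs, targets, neighbors, levels, max_rounds=40):
    """Iterate best responses until no PE changes; return (freqs, rounds or None)."""
    freqs = list(freqs)
    for round_no in range(1, max_rounds + 1):
        changed = False
        for i in range(len(freqs)):
            choice = best_response(i, freqs, targets, neighbors, levels)
            if choice != freqs[i]:
                freqs[i] = choice
                changed = True
        if not changed:
            return freqs, round_no  # no PE can improve alone: a Nash equilibrium
    return freqs, None  # did not settle within max_rounds


if __name__ == "__main__":
    # Three PEs in a chain, four frequency levels, identical targets.
    final, rounds = run_game([4, 1, 3], [2, 2, 2], [[1], [0, 2], [1]], [1, 2, 3, 4])
    print(final, rounds)
```

Each PE only needs its neighbours' current frequencies, which is what makes this style of control distributed; whether such a loop converges in general depends entirely on the cost model.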
An efficient run-time router for connecting modules in FPGAs
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629919
J. Surís, C. Patterson, P. Athanas
It is often desirable to change the logic and/or the connections within an FPGA design on-the-fly without the benefit of a workstation or vendor CAD software. This paper presents a dynamic router for Xilinx FPGAs, designed to run on stand-alone embedded systems. With information obtained from Xilinx's XDL tool, a compact routing database for the Virtex-II/IIP/4 devices is built which only requires 96 KB of storage. A channel routing algorithm is used because of its deterministic execution time and because all routing resources in the channel are available. Sample channels are routed with the router and compared with the Xilinx PAR tool. Improvements in both execution time and in memory usage of several orders of magnitude are observed.
Citations: 34
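The abstract above does not say which channel routing algorithm the run-time router uses. Purely as an illustration of why channel routing has predictable running time, here is the classic left-edge algorithm: each net is reduced to a horizontal interval across the channel, intervals are sorted by their left end, and each one is placed on the first track where it does not overlap the previous interval. Vertical constraints between nets are ignored in this sketch, and the interval representation is an assumption.

```python
def left_edge(intervals):
    """Assign (left, right) net intervals to tracks using the left-edge rule.

    Returns a list of tracks, each a list of non-overlapping intervals.
    """
    tracks = []
    for left, right in sorted(intervals):
        for track in tracks:
            # The interval fits if it starts strictly after the last one ends.
            if track[-1][1] < left:
                track.append((left, right))
                break
        else:
            # No existing track has room, so open a new one.
            tracks.append([(left, right)])
    return tracks


if __name__ == "__main__":
    nets = [(5, 7), (0, 3), (4, 6), (1, 4)]
    for number, track in enumerate(left_edge(nets)):
        print(f"track {number}: {track}")
```

The work is one sort plus one pass over the nets with a scan of the open tracks, so the running time depends only on the number of nets and tracks, with no iterative rip-up and retry.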
Exploring compact design on high throughput coarse grained reconfigurable architectures
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630004
K. Tanigawa, Tetsuya Zuyama, Takuro Uchida, T. Hironaka
Aiming toward a compact high-throughput reconfigurable architecture, we propose the reconfigurable processor DS-HIE. In order to achieve compactness and high throughput, the DS-HIE architecture executes operations following a bit-serial computation scheme and adopts a Benes network as its routing resource. Bit-serial computation gives the DS-HIE architecture a small chip area and high throughput, and the Benes network ensures high availability of routing paths within a compact chip area. In this paper, we explain several methods, namely two data transfer methods and three feedback path methods, and provide an evaluation of the architecture. The evaluation results show that the structure with the smallest chip area combines the dedicated wiring method for data transfer with the area effort method for routing. Further, the transistor count of the DS-HIE processor is notably smaller than that of the Core 2 Duo processor.
Citations: 8
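The compactness argument for a Benes network, mentioned in the abstract above, can be checked with a short calculation. A Benes network with N = 2^k ports has 2k - 1 stages of N/2 two-by-two switches, so its switch count grows as N log N, whereas a full crossbar needs N^2 crosspoints. The sketch below computes both; the function names are illustrative, and the comparison is only indicative because a 2x2 switch and a crossbar crosspoint are not the same hardware cost.

```python
def benes_size(n_ports):
    """Return (stages, switches) for a Benes network with n_ports ports.

    n_ports must be a power of two and at least 2.
    """
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("n_ports must be a power of two >= 2")
    k = n_ports.bit_length() - 1          # log2(n_ports)
    stages = 2 * k - 1
    switches = stages * n_ports // 2      # n_ports / 2 two-by-two switches per stage
    return stages, switches


def crossbar_crosspoints(n_ports):
    """Number of crosspoints in a full n_ports x n_ports crossbar."""
    return n_ports * n_ports


if __name__ == "__main__":
    for n in (8, 64, 256):
        stages, switches = benes_size(n)
        print(n, stages, switches, crossbar_crosspoints(n))
```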
Scalable high performance computing on FPGA clusters using message passing
Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629979
Eoin Creedon, M. Manzke
The direct connection of application logic to network logic allows parallel applications to better leverage the network resources. We present a hardware description language message passing application programming interface (HDL MP API) for FPGAs. This allows an application to operate both local and network resources in a uniform, scalable and portable manner, independent of the interconnect. We use the message passing communication paradigm, with all necessary communication operations performed by dedicated control hardware, independently of the interconnect. Ethernet has been used as the interconnect to demonstrate the HDL MP API functionality for this proof-of-concept system. Parallel linear array matrix multiplication has been implemented and tested using the HDL MP API. This application demonstrates the scalability provided by the HDL MP API.
Citations: 3