Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.97TB100186): Latest Articles

On acceleration of the check tautology logic synthesis algorithm using an FPGA-based reconfigurable coprocessor

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624629

J. Cong, John Peck

We summarize our study on implementing tautology checking, a fundamental logic synthesis algorithm, using an FPGA-based reconfigurable application-specific coprocessor. The use of the tautology checking algorithm is discussed first, followed by the specifics of the hardware accelerator implementation and its interface to application software. We compare our hardware accelerator for the tautology check algorithm with the software implementation of the same algorithm in Espresso II (R. Rudell and A. Sangiovanni-Vincentelli, 1987). Our experimental results show that our accelerator achieves a maximum speedup factor of 2.94 and an average of 1.36 on 110 modified industry benchmarks included with the Espresso II package.
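For readers unfamiliar with tautology checking, the following is a minimal software sketch of the idea: a cover (a list of cubes) is a tautology if it evaluates to 1 for every input. The cube encoding (strings over '0', '1', '-'), the function names, and the plain Shannon-expansion recursion are assumptions made for illustration; this is neither Espresso II's implementation nor the authors' hardware design.

```python
from typing import List

# Assumed encoding: one character per variable.
# '1' = positive literal, '0' = negated literal, '-' = variable absent.
Cube = str


def cofactor(cover: List[Cube], var: int, value: str) -> List[Cube]:
    """Restrict the cover to the half-space where variable `var` equals `value`."""
    result = []
    for cube in cover:
        literal = cube[var]
        if literal == '-' or literal == value:
            # The cube survives; the variable is now fixed, so drop its literal.
            result.append(cube[:var] + '-' + cube[var + 1:])
    return result


def is_tautology(cover: List[Cube]) -> bool:
    """Return True if the cover evaluates to 1 for every input assignment."""
    if not cover:
        return False  # an empty cover is the constant 0
    if any(all(ch == '-' for ch in cube) for cube in cover):
        return True  # a cube with no literals covers the whole space
    # Pick the first variable that still appears as a literal somewhere.
    num_vars = len(cover[0])
    var = next(v for v in range(num_vars)
               if any(cube[v] != '-' for cube in cover))
    # Shannon expansion: both cofactors must themselves be tautologies.
    return (is_tautology(cofactor(cover, var, '0'))
            and is_tautology(cofactor(cover, var, '1')))
```

As a usage example, `is_tautology(["1-", "01", "00"])` checks the cover x0 + x0'x1 + x0'x1', which covers every assignment of two variables.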

Citations: 4

Acceleration of an FPGA router

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624617

P. K. Chan, M. Schlag

The authors describe their experience and progress in accelerating an FPGA router. Placement and routing is undoubtedly the most time-consuming process in automatic chip design or in configuring programmable logic devices as reconfigurable computing elements. Their goal is to accelerate routing of FPGAs tenfold with a combination of processor clusters and hardware acceleration. Coarse-grain parallelism is exploited by having several processors route separate groups of nets in parallel. A hardware accelerator is presented that exploits the fine-grain parallelism in routing individual nets.

Citations: 30

Automated field-programmable compute accelerator design using partial evaluation

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624614

Qiang Wang, D. Lewis

This paper describes a compiler that generates both hardware and controlling software for field-programmable compute accelerators. By analyzing a source program together with part of its input, the compiler generates VHDL descriptions of functional units that are mapped on a set of FPGA chips and an optimized sequence of control constructions that run on the customized machine. The primary technique employed in the compiler is partial evaluation, which is used to transform an application program together with part of its input into an optimized program. Further phases in the compiler identify pieces of the program that can be realized in hardware and schedule computations to execute on the resulting hardware. Finally, a set of specialized functional units generated by the compiler for a timing simulation program is used to demonstrate the approach.
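Partial evaluation in general can be illustrated with a toy example: given a program and the part of its input that is known ahead of time, emit a specialized program with the known part folded in. The sketch below specializes a power function for a fixed exponent and emits Python source. The function name `specialize_power` and the choice of Python as the output are assumptions for illustration only; the compiler described in the abstract emits VHDL and control code, which this sketch does not attempt to reproduce.

```python
from typing import Callable, Tuple


def specialize_power(n: int) -> Tuple[str, Callable[[float], float]]:
    """Specialize x**n for a known non-negative exponent n.

    Returns the generated source text and the compiled function.
    The loop over the exponent is evaluated now, at specialization time,
    so the residual program contains only multiplications.
    """
    if n < 0:
        raise ValueError("exponent must be non-negative")
    # Unroll the multiplication chain using the known input n.
    expr = "1" if n == 0 else " * ".join(["x"] * n)
    name = f"power_{n}"
    source = f"def {name}(x):\n    return {expr}\n"
    namespace: dict = {}
    exec(source, namespace)  # compile the residual program
    return source, namespace[name]
```

For example, `specialize_power(3)` yields a function whose body is `return x * x * x`, with no loop or exponent argument left at run time.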

Citations: 19

The RAW benchmark suite: computation structures for general purpose computing

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624613

J. Babb, M. Frank, V. Lee, E. Waingold, R. Barua, M. Taylor, Jang Kim, D. Srikrishna, A. Agarwal

The RAW benchmark suite consists of twelve programs designed to facilitate comparing, validating, and improving reconfigurable computing systems. These benchmarks run the gamut of algorithms found in general purpose computing, including sorting, matrix operations, and graph algorithms. The suite includes an architecture-independent compilation framework, Raw Computation Structures (RawCS), to express each algorithm's dependencies and to support automatic synthesis, partitioning, and mapping to a reconfigurable computer. Within this framework, each benchmark is portably designed in both C and Behavioral Verilog and scalably parameterized to consume a range of hardware resource capacities. To establish initial benchmark ratings, we have targeted a commercial logic emulation system based on virtual wires technology to automatically generate designs up to millions of gates (14 to 379 FPGAs). Because the virtual wires techniques abstract away machine-level details like FPGA capacity and interconnect, our hardware target for this system is an abstract reconfigurable logic fabric with memory-mapped host I/O. We report initial speeds in the range of 2X to 1800X faster than a 2.82 SPECint95 SparcStation 20 and encourage others in the field to run these benchmarks on other systems to provide a standard comparison.

Citations: 113

Real-time stereo vision on the PARTS reconfigurable computer

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624620

J. Woodfill, B. V. Herzen

The paper describes a powerful, scalable, reconfigurable computer called the PARTS engine. The PARTS engine consists of 16 Xilinx 4025 FPGAs and 16 one-megabyte SRAMs. The FPGAs are connected in a partial torus, each associated with two adjacent SRAMs. The SRAMs are tightly coupled to the FPGAs so that all the SRAMs can be accessed concurrently. The PARTS engine fits on a standard PCI card in a personal computer or workstation. The first application implemented on the PARTS engine is a depth from stereo vision algorithm that computes 24 stereo disparities on 320 by 240 pixel images at 42 frames per second. Running at this speed, the engine is performing approximately 2.3 billion RISC-equivalent operations per second, accessing memory at a rate of 500 million bytes per second and attaining throughput of over 70 million point × disparity measurements per second.
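The throughput figure in the abstract can be checked from the other numbers it gives: one measurement per pixel per disparity per frame. The short sketch below does that arithmetic; the function and constant names are chosen here for illustration.

```python
# Figures taken from the abstract.
WIDTH, HEIGHT = 320, 240   # image size in pixels
DISPARITIES = 24           # stereo disparities computed per pixel
FRAMES_PER_SECOND = 42


def measurements_per_second(width: int, height: int,
                            disparities: int, fps: int) -> int:
    """Point x disparity measurements per second for a dense stereo pass."""
    return width * height * disparities * fps


rate = measurements_per_second(WIDTH, HEIGHT, DISPARITIES, FRAMES_PER_SECOND)
print(f"{rate:,} measurements per second")
```

The product is 77,414,400, consistent with the stated "over 70 million" measurements per second.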

Citations: 227

An FPGA-based coprocessor for ATM firewalls

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624602

J. T. McHenry, P. Dowd, F. Pellegrino, T. M. Carrozzi, W. B. Cocks

This implementation of the firewall enables a high degree of traffic selectability yet avoids the usual performance penalty associated with IP level firewalls. This approach is applicable to high-speed broadband networks, and asynchronous transfer mode (ATM) networks are addressed in particular. Security management is achieved through a new technique of active connection management with authentication. Past approaches to network security involve firewalls providing selection based on packet filtering and application level proxy gateways. IP level firewalling was sufficient for traditional networks but causes a severe performance degradation in high speed broadband environments. The approach described in this paper discusses the use of an FPGA-based front end processor that filters relevant signaling information to the firewall host while at the same time allowing friendly connections to proceed at line speed with no performance degradation.

Citations: 51

The swappable logic unit: a paradigm for virtual hardware

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624607

G. Brebner

Swappable Logic Units (SLUs) were introduced by the author previously (1996) to play a role in virtual hardware subsystems that is analogous to the role of pages or segments in virtual memory subsystems. The intention is that a conventional operating system can be extended to manage SLU circuitry implemented using FPGA real estate. In order to minimise operating system overheads, two particular SLU-based virtual hardware models were deemed practical: a "sea of accelerators" model and a "parallel harness" model. This paper looks in some detail at how SLUs will fit within the overall environment of a fairly conventional hardware/software system. First, there is a discussion of the FPGA-based hardware environment for SLUs, followed by a discussion of the software environment from which SLUs might be used. After this, there is a description of the operational properties that SLUs can have, and how these fit in with the two virtual hardware models. Finally, proposals for standard interfaces between SLUs and their environment are discussed. These interfaces can be regarded as constraints on the designers of SLU circuitry or, more positively, as suppliers of an enriched context within which such circuitry operates. The overall impact of the work presented in the paper is to show that it is feasible to incorporate configurable hardware within traditional computer systems that use high-level language programs and computer operating systems. That is, it should not always be necessary to devise special-purpose hardware/software systems to realise custom computing.

Citations: 124

A dynamic reconfiguration run-time system

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624606

Jim Burns, A. Donlin, Jonathan D. Hogg, Satnam Singh, Mark de Wit

The feasibility of run-time reconfiguration of FPGAs has been established by a large number of case studies. However, these systems have typically involved an ad hoc combination of hardware and software. The software that manages the dynamic reconfiguration is typically specialised to one application and one hardware configuration. We present three different applications of dynamic reconfiguration, based on research activities at Glasgow University, and extract a set of common requirements. We present the design of an extensible run-time system for managing the dynamic reconfiguration of FPGAs, motivated by these requirements. The system is called RAGE, and incorporates operating-system style services that permit sophisticated and high level operations on circuits.

Citations: 136

Automated target recognition on SPLASH 2

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624619

Michael Rencher, B. Hutchings

Automated target recognition is an application area that requires special-purpose hardware to achieve reasonable performance. FPGA-based platforms can provide a high level of performance for ATR systems if the implementation can be adapted to the limited FPGA and routing resources of these architectures. The paper discusses a mapping experiment where a linear-systolic implementation of an ATR algorithm is mapped to the SPLASH 2 platform. Simple column oriented processors were used throughout the design to achieve high performance with limited nearest neighbor communication. The distributed SPLASH 2 memories are also exploited to achieve a high degree of parallelism. The resulting design is scalable and can be spread across multiple SPLASH 2 boards with a linear increase in performance.

Citations: 70

FPGA synthesis on the XC6200 using IRIS and Trianus/Hades (or from heaven to hell and back again)

Pub Date : 1997-04-16 DOI: 10.1109/FPGA.1997.624615

Roger Francis Woods, S. Ludwig, J. Heron, D. Trainor, Stephan W. Gehring

The implementation of a number of FIR filter structures in the Xilinx XC6200 technology is presented. The designs have been implemented using a combination of IRIS, an architectural synthesis tool, and Trianus/Hades, a set of integrated tools for implementing algorithms on Custom Computing Machines. The main attraction of this approach is that algorithms can be compiled quickly, so performance changes can be made at the architectural level in IRIS rather than at the FPGA layout level.
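As background on what an FIR filter computes, here is a minimal direct-form reference model: each output sample is a weighted sum of the current and previous input samples, y[n] = sum over k of h[k] * x[n - k]. The abstract does not say which FIR structures were implemented, so the direct form and the function name `fir_filter` are assumptions for illustration; this is a software model, not the XC6200 design.

```python
from typing import List, Sequence


def fir_filter(samples: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Direct-form FIR filter: y[n] = sum_k taps[k] * samples[n - k].

    Samples before the start of the input are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            if n - k >= 0:  # skip taps that reach before the first sample
                acc += coeff * samples[n - k]
        output.append(acc)
    return output
```

For example, `fir_filter([1, 2, 3], [1, 1])` sums each sample with its predecessor. A model like this is typically used as a golden reference when checking a hardware implementation.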

Citations: 6
