2007 3rd Southern Conference on Programmable Logic: Latest Papers, Page 5

Merging FPGA and FPAA Reconfiguration Capabilities for IEEE 1451.4 Compliant Smart Sensor Applications

2007 3rd Southern Conference on Programmable Logic

Pub Date : 2007-06-18 DOI: 10.1109/SPL.2007.371753

D. Morales, A. García, A. Palma, A. Martínez-Olmos

This work focuses on the application of field programmable analog arrays (FPAAs) and field programmable gate arrays (FPGAs) together as a single system for implementing IEEE 1451.4 sensor interfaces. The inherent reconfigurability of these two hardware platforms increases the versatility of the overall system, leading to a variety of sensor connectivity and remote measurement and control options.

Citations: 15

An Efficient Scalable Parallel Hardware Architecture for Multilayer Spiking Neural Networks

2007 3rd Southern Conference on Programmable Logic

Pub Date : 2007-06-18 DOI: 10.1109/SPL.2007.371742

M.A. Nuño-Maganda, M. Arias-Estrada, C. Torres-Huitzil

Artificial neural networks (ANNs) are processing models that have been widely explored because of their computational capabilities for solving problems. Recently, spiking neural networks (SNNs) have been studied as more biologically plausible models that resemble biological neurons more closely than classical ANNs. Although SNNs offer richer dynamics, their full utilization in practical systems is still limited by the high computational demand of microprocessor-based software implementations. To overcome this drawback, an efficient scalable parallel hardware architecture for SNNs is proposed that efficiently maps the area-demanding and dense interconnection requirements of neural processing. Because their communication scheme is based on digital spikes, SNN models reduce the bandwidth needed to exchange information among neurons, which makes them more suitable for hardware implementation. The hardware implementation is divided into two main phases: recall and learning. This paper mainly presents timing, hardware resources, and a performance comparison for the recall phase.
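
As an illustration of the spike-based communication mentioned in the abstract, the following is a minimal software sketch of a single neuron that exchanges only binary spikes. The abstract does not name the neuron model used in the paper; the leaky integrate-and-fire model, the integer arithmetic, and the names `lif_run`, `threshold`, and `leak` are assumptions made for this example only.

```python
def lif_run(input_spikes, weights, threshold=10, leak=1):
    """Simulate one leaky integrate-and-fire neuron in discrete time.

    input_spikes: list of time steps; each step is a list of 0/1 values,
                  one per input synapse.
    weights:      integer weight of each synapse.
    Returns the list of 0/1 output spikes, one per time step.
    NOTE: the model and parameters are illustrative assumptions, not the
    design described in the paper.
    """
    potential = 0
    output = []
    for spikes in input_spikes:
        # Integrate: add the weight of every synapse that received a spike.
        potential += sum(w for s, w in zip(spikes, weights) if s)
        # Leak: decay by a fixed amount, never dropping below zero.
        potential = max(0, potential - leak)
        if potential >= threshold:
            output.append(1)   # fire a single-bit spike
            potential = 0      # reset after firing
        else:
            output.append(0)
    return output


# Hypothetical two-synapse input train.
example_inputs = [[1, 1], [1, 0], [0, 0], [1, 1], [1, 1]]
example_output = lif_run(example_inputs, weights=[4, 3])
print(example_output)
```

Only one bit per neuron per time step crosses between neurons in this sketch, which is the property the abstract credits with lowering interconnect bandwidth.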

Citations: 1

FPGA-Based Acceleration of Fingerprint Minutiae Matching

2007 3rd Southern Conference on Programmable Logic

Pub Date : 2007-06-18 DOI: 10.1109/SPL.2007.371728

A. Lindoso, L. Entrena, J. Izquierdo

Fingerprint recognition is the most widely used and studied biometric technique because of its universality, its distinctiveness, and the decreasing cost of sensing devices. Among fingerprint identification techniques, minutiae-based algorithms are the most mature. However, these methods are computationally expensive, particularly when comparing against large databases. This work studies the performance gains that can be achieved with FPGAs. For this purpose, two minutiae-based fingerprint matching algorithms were selected and implemented in an FPGA in order to compare the requirements and performance of the software and hardware implementations. Experimental results demonstrate the feasibility of implementing fingerprint matching algorithms in current FPGA devices, achieving speed-ups of one or two orders of magnitude. Customizing the proposed implementations can lead to several architectures optimized for size, price, speed, or accuracy.
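
To make the cost of minutiae-based matching concrete, here is a small software sketch of one generic matching scheme. The abstract does not say which two algorithms were implemented, so this is not the paper's method: it assumes the two prints are already aligned, represents each minutia as a hypothetical `(x, y, theta)` tuple, and pairs points greedily within assumed distance and angle tolerances.

```python
import math


def match_score(template, query, dist_tol=10.0, angle_tol=math.radians(15)):
    """Greedy one-to-one pairing of pre-aligned minutiae.

    Each minutia is an (x, y, theta) tuple with theta in radians.
    Returns matched_pairs / max(len(template), len(query)), in [0, 1].
    NOTE: tolerances and the scoring rule are illustrative assumptions.
    """
    if not template or not query:
        return 0.0
    used = set()      # template indices that are already paired
    matched = 0
    for x1, y1, t1 in query:
        for j, (x2, y2, t2) in enumerate(template):
            if j in used:
                continue
            dist = math.hypot(x1 - x2, y1 - y2)
            # Smallest absolute difference between the two ridge angles.
            diff = abs(t1 - t2) % (2 * math.pi)
            diff = min(diff, 2 * math.pi - diff)
            if dist <= dist_tol and diff <= angle_tol:
                used.add(j)
                matched += 1
                break
    return matched / max(len(template), len(query))


# Hypothetical minutiae sets: two of the three query points have a partner.
template_print = [(10, 10, 0.0), (50, 40, 1.0), (80, 20, 2.0)]
query_print = [(12, 11, 0.1), (52, 43, 1.1), (200, 200, 0.5)]
print(match_score(template_print, query_print))
```

The nested loop compares every query minutia with every template minutia, and an identification search repeats that for each database entry, which is why the comparison against large databases is the expensive part.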

Citations: 21

A Novel Reconfigurable Architecture for Temporal and Spatial Application Mapping

2007 3rd Southern Conference on Programmable Logic

Pub Date : 2007-06-18 DOI: 10.1109/SPL.2007.371726

A. Danilin, S. Sawitzki

This paper introduces a novel FPGA-like architecture that can perform operations in space (for maximum performance) or in time (for minimum hardware area) at the logic-cell level. Building on our previous work on mapping DSP applications onto the ASTRA reconfigurable architecture, this paper describes the microarchitecture in more detail and introduces some significant improvements. The silicon area of the logic tile is reduced by 40%. The area figures of the benchmarks are only a factor of 10-25 worse than the ASIC implementation, a very competitive ratio for a reconfigurable architecture.

Citations: 0

Compact FPGA-Based Systolic Array Architecture for Motion Estimation Using Full Search Block Matching

2007 3rd Southern Conference on Programmable Logic

Pub Date : 2007-02-01 DOI: 10.1109/SPL.2007.371756

G. Saldaña, M. Arias-Estrada

Motion estimation constitutes a significant part of the computation in video compression standards such as MPEG4. The present work focuses on the development of a reconfigurable systolic architecture implementing the full search block matching algorithm, which is highly compute-intensive and requires a large bandwidth to obtain real-time performance. The architecture comprises a smart memory scheme that reduces the number of accesses to image memory, and router elements that handle data movement among different structures inside the same architecture, adding the possibility of chaining multiple processing blocks. Every processing element (PE) in the array includes a double ALU in order to search multiple macro-blocks in parallel. Results show that a peak performance on the order of 9 GOPS can be achieved.
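
The full search block matching algorithm named in the abstract can be sketched in software as follows. This is a plain sequential reference, not the systolic design from the paper; the sum of absolute differences (SAD) cost, the function names, and the synthetic test frames are assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range pixels and return
    (dx, dy, cost) of the lowest-cost candidate for the block at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def pattern(x, y):
    """Deterministic synthetic pixel value (hypothetical test data)."""
    return (7 * x + 13 * y) % 251


# Reference frame, and a current frame whose content sits 2 pixels to the
# right and 1 pixel down in the reference frame.
ref_frame = [[pattern(x, y) for x in range(16)] for y in range(16)]
cur_frame = [[pattern(x + 2, y + 1) for x in range(16)] for y in range(16)]

print(full_search(cur_frame, ref_frame, bx=6, by=6, n=4, search_range=3))
```

For a search range of r, each block needs (2r + 1)^2 SAD evaluations of n^2 pixels each, and neighbouring candidates reread almost the same reference pixels. That repeated access is the cost the abstract's memory scheme and parallel PEs are aimed at.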

Citations: 1