
8th Euromicro Conference on Digital System Design (DSD'05): Latest Papers

Capturing processor architectures from protocol processing applications: a case study
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.23
S. Virtanen, J. Paakkulainen, T. Nurmi
We present a case study in finding optimized processor architectures for a given protocol processing application. The process involves application analysis, hardware/software partitioning and optimization, and evaluation of design quality through simulations, estimations and synthesis. The case study targeted the processing of key IPv6 routing functions at 200 MHz using 0.18 µm CMOS technology. A comparison with an implementation on a commercial processor revealed that the captured architectures provided similar or better performance. Checksum calculation in particular was efficient in the captured architectures.
Citations: 0
Power-composition profile driven co-synthesis with power management selection for dynamic and leakage energy reduction
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.62
Dong Wu, B. Al-Hashimi, M. Schmitz, P. Eles
Recent research has shown that the combination of dynamic voltage scaling (DVS) and adaptive body biasing (ABB) yields high energy reductions in embedded systems. Nevertheless, implementing DVS and ABB incurs a significant system cost, making the combination less attractive for many small systems. In this paper we demonstrate that it is possible to reduce this system cost and to achieve energy savings comparable to those obtained with a combined DVS and ABB scheme, through a co-synthesis methodology that is aware of the tasks' power-composition profile (the ratio of dynamic power to leakage power). In particular, the presented methodology performs a power management selection at the architectural level, i.e., it decides which processing elements are to be equipped with which power management scheme (DVS, ABB, or combined DVS and ABB), with the aim of achieving high energy savings at a reduced implementation cost. The proposed technique maps, schedules, and voltage scales applications specified as task graphs with timing constraints. Detailed experiments, including a real-life benchmark, are conducted to demonstrate the effectiveness of the proposed methodology.
Citations: 2
Reconfigurable parallel approximate string matching on FPGAs
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.66
J. H. Park
This paper presents the design and implementation of reconfigurable parallel approximate string matching hardware on FPGAs. The design is based on a linear systolic dataflow algorithm, and control logic is added to reconfigure the resulting hardware. For the k-differences version of the approximate string matching problem, the proposed approach finds all approximate occurrences of a pattern in the reference string with time complexity O(n+m), where n and m are the lengths of the reference string and the pattern, respectively. Unlike other hardware approaches found in the literature, the design is size optimized, since it uses only m PEs, a number that is independent of the reference string length. The design is also flexible enough to handle pattern strings of arbitrary size within the maximum bound. The design is implemented and tested on the target device Xilinx Spartan 2S XC2S200EPQ208.
Citations: 1
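For readers unfamiliar with the k-differences problem named in the abstract above, the following is a minimal sequential sketch in Python. It assumes the standard dynamic-programming formulation (often attributed to Sellers), which the abstract does not confirm as the paper's exact recurrence. The function name `k_differences` is hypothetical. This software version takes O(nm) time; the O(n+m) figure in the abstract comes from evaluating the m pattern cells in parallel hardware, which this sketch does not model.

```python
from typing import List


def k_differences(pattern: str, text: str, k: int) -> List[int]:
    """Return 0-based end positions in `text` where `pattern` occurs
    with at most k edits (insertions, deletions, substitutions)."""
    m = len(pattern)
    # col[i] = fewest edits needed to match pattern[:i] against some
    # substring of text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # D[0][j-1]; col[0] stays 0 so a match may start anywhere
        for i in range(1, m + 1):
            above_left = prev_diag          # D[i-1][j-1]
            prev_diag = col[i]              # save D[i][j-1] for the next row
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(above_left + cost,  # match or substitute
                         col[i] + 1,         # extra character in text
                         col[i - 1] + 1)     # character missing from text
        if col[m] <= k:
            ends.append(j)
    return ends


print(k_differences("abc", "xxabcxx", 1))  # prints [3, 4, 5]
```

Each inner-loop iteration depends only on three neighbouring cells, which is the property that makes a linear array of m processing elements a natural fit for this computation.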
Wireless sensor network implementation for industrial linear position metering
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.78
M. Kohvakka, Marko Hännikäinen, T. Hämäläinen
This paper presents the design and performance measurements of a prototype wireless sensor network (WSN) for industrial linear position metering. The design includes two different prototype platforms and a user application. The prototypes combine energy-efficient commercial off-the-shelf components, including a 2.4 GHz radio, with the custom TUTWSN communication protocols, resulting in high robustness, autonomous operation and very low power consumption. The user application displays sensor data graphically and enables further data analysis. Measurements comprise component power analysis and prototype performance measurements. They indicate an average node power consumption of 200 µW to 400 µW when a 16-bit sample is measured at a 1 Hz sample rate and routed to a WSN gateway with 1 s latency per hop and 512 bps throughput between nodes. The predicted lifetime of the implemented WSN is 2 months with a small rechargeable battery, or over 2 years with two AA batteries.
Citations: 15
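As a rough plausibility check on the lifetime figures quoted in the abstract above, the sketch below divides an assumed battery energy by the reported average node power. The per-cell capacity (1.5 V nominal, 2.5 Ah) is an assumption chosen for illustration and does not come from the source. Self-discharge and regulator losses are ignored, and the authors' own prediction method is not known from the abstract.

```python
HOURS_PER_YEAR = 24 * 365


def lifetime_years(energy_wh: float, avg_power_uw: float) -> float:
    """Idealized lifetime: stored energy divided by average power draw."""
    avg_power_w = avg_power_uw * 1e-6
    return energy_wh / avg_power_w / HOURS_PER_YEAR


# Assumed values (not from the source): 1.5 V nominal, 2.5 Ah per AA cell.
AA_CELL_WH = 1.5 * 2.5
TWO_AA_WH = 2 * AA_CELL_WH

for power_uw in (200, 400):  # average node power range reported in the abstract
    years = lifetime_years(TWO_AA_WH, power_uw)
    print(f"{power_uw} uW: {years:.1f} years")
```

Under these assumptions the upper end of the reported power range gives slightly more than two years, which is in line with the "over 2 years" figure in the abstract.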
Multi-media applications and imprecise computation
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.58
M. Breuer
As feature sizes continue to decrease and clock rates and device counts on a VLSI chip increase, it becomes increasingly difficult to maintain yields at their present levels. Process variation, noise and spot defects create very costly problems for our industry. Luckily, in the domain of multi-media, there exists a large body of functions where computational results need not always be correct. We show that for many VLSI implementations of signal processing algorithms, such as MPEG and JPEG encoders, a significant proportion of chips having low levels of defects provide erroneous but acceptable results. We introduce the concept of error-tolerance, and mention related issues needed to support this concept, including ways for specifying performance, design techniques that consider yield, test techniques for quantifying erroneous behavior, and finally the issue of marketing. The motivation for this work is to significantly increase the effective yield of a process, encourage the implementation of complex data processing chips, and drastically reduce chip costs.
Citations: 97
A processor for testing mixed-signal cores in system-on-chip
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.11
Francisco Duarte, J. M. D. Silva, J. Alves, G. Pinho, J. S. Matos
This paper describes the design of a processor dedicated to testing cores embedded in a system-on-chip. This processor, which can be implemented within a system's reconfigurable area, is responsible for scheduling and controlling test operations, performing preliminary data processing, and providing the interface to an external tester. Building these test operations on-chip simplifies the external tester interface and reduces testing time. The testing procedure and the infrastructure required to test an A/D converter are described as an example.
Citations: 0
Remote path delay fault simulation
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.68
Øystein Gjermundnes, E. Aas
This paper describes the design of a remote fault simulator for delay faults that can be used by students to investigate the effect of different stimuli generators for different types of delay fault models.
Citations: 2
Embedded object architecture
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.39
Tero Vallius, J. Röning
Traditionally, the embedded system design process demands a considerable amount of expertise, time and money. This makes developing embedded systems impossible for many companies, and in research facilities it hinders the testing of new research results with real embedded systems. We previously presented an easy and fast embedded system development concept based on embedded objects. The embedded object concept (EOC) utilizes common object oriented methods used in software by applying them to combined Lego-like software-hardware entities. This concept enables people without comprehensive knowledge in electronics design to create new embedded systems. In this paper we present a physical and logical architecture for this concept.
Citations: 5
Efficient MLP digital implementation on FPGA
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.38
S. Vitabile, V. Conti, Fulvio Gennaro, F. Sorbello
The efficiency and the accuracy of a digital feedforward neural network must be optimized to obtain both a high classification rate and minimum on-chip area. In this paper an efficient MLP digital implementation is presented. The key features of the hardware implementation are the virtual neuron based architecture and the use of the sinusoidal activation function for the hidden layer. The effectiveness of the proposed solutions has been evaluated by developing different FPGA based neural prototypes for the high energy physics domain and the automatic road sign recognition domain. The use of the sinusoidal activation function decreases hardware resource usage by about 32% when compared with the standard sigmoid based neuron implementation. The virtual neuron implementation makes the mapping of a neural network onto hardware devices efficient, since it leads to a significant decrease in concurrent memory accesses.
Citations: 24
Efficient Implementation of digital filters with use of advanced synthesis methods targeted FPGA architectures
Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.81
M. Rawski, P. Tomaszewicz, H. Selvaraj, T. Luba
This paper presents an efficient method for implementing digital filters targeting FPGA architectures. The traditional approach is based on the application of general purpose multipliers. However, the performance of multipliers implemented in FPGA architectures does not allow the construction of high performance digital filters. In this paper the application of distributed arithmetic is demonstrated. Since in this approach combinational LUT blocks replace general purpose multipliers, it is possible to construct digital filters of very high performance. However, LUT blocks can be of considerable size, so advanced synthesis methods have to be used to map them efficiently into FPGA resources. In this paper an application of functional decomposition based synthesis has been investigated. This method is recognised as the best synthesis method targeting FPGA architectures and allows significant improvements in digital filter implementation. The paper presents many examples confirming that decomposition reduces the logic cell utilisation of filter implementations based on the distributed arithmetic concept, with no performance degradation and in some cases a performance increase.
Citations: 40
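The distributed arithmetic idea mentioned in the abstract above, where lookup tables replace multipliers, can be illustrated with a short Python sketch of one filter output (a dot product of coefficients and samples). This shows the textbook bit-serial form under stated assumptions: integer coefficients, samples in `bits`-bit two's complement, and hypothetical function names `build_lut` and `da_dot`. It is not the authors' implementation and does not show the functional decomposition step that the paper applies to the LUT blocks.

```python
from typing import List


def build_lut(coeffs: List[int]) -> List[int]:
    """Entry `addr` holds the sum of coefficients whose index bit is set in addr."""
    k = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << k)]


def da_dot(coeffs: List[int], samples: List[int], bits: int) -> int:
    """Compute sum(c * x) using only table lookups, shifts and adds."""
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one LUT address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << i
        term = lut[addr] << b
        # In two's complement the most significant bit carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc


coeffs = [3, -1, 4, 2]
samples = [5, -7, 12, -128]       # must fit in 8-bit two's complement
print(da_dot(coeffs, samples, 8))  # prints -186
```

The table has 2^K entries for K taps, which is why, as the abstract notes, the LUT blocks can become large and need efficient mapping onto FPGA resources.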