2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools - Latest Articles

A Packet Classifier Using a Parallel Branching Program Machine

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.18

Hiroki Nakahara, Tsutomu Sasao, M. Matsuura

A branching program machine (BM) is a special-purpose processor that uses only two kinds of instructions: branch and output instructions. Thus, the architecture of the BM is much simpler than that of a general-purpose processor (MPU). Since the BM uses dedicated instructions for a special-purpose application, it is faster than the MPU. This paper presents a packet classifier using a parallel branching program machine (PBM). To reduce computation time and code size, a set of rules for the packet classifier is first partitioned into groups, which are then evaluated by the PBM in parallel. This paper also shows a method to estimate the number of BMs necessary to realize the packet classifier. The PBM32, consisting of 32 BMs, has been implemented on an FPGA and compared with Intel's Core2Duo@1.2GHz. The PBM32 is 8.1-11.1 times faster than the Core2Duo and requires only 0.2-10.3 percent of the memory used by the Core2Duo.

Citations: 11

System Level Synthesis for Ultra Low-Power Wireless Sensor Nodes

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.88

Muhammad Adeel Pasha, Steven Derrien, O. Sentieys

Engineering a hardware platform for a Wireless Sensor Network (WSN) node is known to be a tough challenge, as the design must satisfy many severe constraints, among which energy dissipation is by far the most challenging. Today, most WSN node platforms are based on low-cost, low-power programmable microcontrollers, even though it is acknowledged that their energy efficiency remains limited and hinders the spread of WSNs to new applications. In this paper, we propose a complete system-level flow for an alternative approach based on the concept of hardware micro-tasks, which relies on hardware specialization and power gating to dramatically improve the energy efficiency of the computational part of the node. Early estimates show power savings of more than one order of magnitude over MCU-based implementations.

Citations: 12

High Level Validation of an Optimization Algorithm for the Implementation of Adaptive Wavelet Transforms in FPGAs

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.96

R. Salvador, F. Moreno, T. Riesgo, L. Sekanina

This paper describes the steps taken towards an FPGA-based implementation of evolvable wavelet transforms for image compression in embedded systems. An Evolutionary Algorithm (EA) for the design and optimization of the transform coefficients is tailored for a suitable System on Chip implementation. Several reductions in the computing requirements have been made to the original algorithm to adapt it for the FPGA implementation. More specifically, this paper addresses the validation of the algorithm using fixed-point arithmetic for the whole optimization process. The results show how high-quality transforms are evolved from scratch with limited-precision arithmetic. Preliminary results of the implementation in an FPGA device are also included.

Citations: 4

Design of Testable Universal Logic Gate Targeting Minimum Wire-Crossings in QCA Logic Circuit

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.114

B. Sen, Anik Sengupta, M. Dalui, B. Sikdar

This work proposes a testable QCA (Quantum-Dot Cellular Automata) logic gate (UQCALG) that realizes the universal functions. The design of the UQCALG is based on the Coupled Majority Minority (CMVMIN) QCA structure, with the aim of reducing wire crossings as well as the number of clock cycles required to operate a QCA circuit. The characterization of defects in this design leads to the synthesis of a test block, realized with majority and minority voters, that ensures the desired testability of a circuit. The experimental designs establish that the UQCALG can yield cost-effective designs of testable QCA logic circuits that may not be possible with a conventional ULG (Universal Logic Gate).

Citations: 18

Power Consumption Modeling for DVFS Exploitation

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.55

A. Castagnetti, C. Belleudy, S. Bilavarn, M. Auguin

Many task scheduling algorithms and power management policies have been developed based on simplistic power models, which rarely take into account the power consumption of the different components of a real system. Most of the models on which the study of DVFS scheduling is based assume that the energy consumption of a processor can be modelled as E ∝ V². This hypothesis, even if partly true, is not generally applicable when considering the complete system, which consists of the processor, memories and power conversion circuits. In this paper we present a power and energy model for a DVFS-enabled mobile computing platform. The platform is based on a low-power SoC, which integrates both the processor core and memory, as well as other hardware accelerators. Our analysis includes the study of the power conversion components that supply the SoC. Starting from measurements, we first characterize the power consumption of the SoC and the converters; a power and energy model for the processor is then proposed. The model is able to predict the power consumption of the processor core with an average error of less than 10%. It is then used to analyse two DVFS scheduling techniques based on the EDF algorithm, Cycle Conserving and Look Ahead. The results show that the CPU energy saving computed using our model is far less than what would be expected using a model that does not take into account the effect of static power.
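The gap between a dynamic-only model and one that includes static power can be illustrated with a toy calculation. The sketch below is not the model proposed in the paper: the effective capacitance, static power, cycle count and the two operating points are hypothetical values chosen only for illustration, and static power is assumed constant (independent of voltage) for simplicity.

```python
"""Toy comparison of a dynamic-only (E ∝ V²) energy model with one that adds static power.

All numeric values are hypothetical; they are not taken from the paper.
"""

C_EFF = 1e-9     # effective switched capacitance in farads (assumed)
P_STATIC = 0.2   # static power in watts, assumed independent of voltage
CYCLES = 1e9     # clock cycles needed by the task (assumed)

# Two hypothetical operating points: (supply voltage in volts, clock frequency in hertz)
FAST = (1.2, 600e6)
SLOW = (0.9, 300e6)


def dynamic_energy(voltage, cycles):
    """Switching energy in joules: C_eff * V^2 per clock cycle."""
    return C_EFF * voltage ** 2 * cycles


def static_energy(p_static, frequency, cycles):
    """Energy in joules drawn by static power over the task's execution time."""
    return p_static * cycles / frequency


def total_energy(point, p_static):
    """Total task energy at an operating point for a given static power."""
    voltage, frequency = point
    return dynamic_energy(voltage, CYCLES) + static_energy(p_static, frequency, CYCLES)


def saving(e_reference, e_new):
    """Relative energy saving of e_new with respect to e_reference."""
    return 1.0 - e_new / e_reference


# Saving predicted when static power is ignored (pure E ∝ V² model)
saving_dynamic_only = saving(total_energy(FAST, 0.0), total_energy(SLOW, 0.0))
# Saving predicted when static power is included
saving_with_static = saving(total_energy(FAST, P_STATIC), total_energy(SLOW, P_STATIC))

if __name__ == "__main__":
    print(f"saving, dynamic-only model: {saving_dynamic_only:.1%}")
    print(f"saving, with static power:  {saving_with_static:.1%}")
```

Under these assumptions, slowing the clock stretches the execution time, so static energy grows while dynamic energy shrinks, and the predicted saving is smaller than the dynamic-only model suggests.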

Citations: 27

Instantiating GENESYS Application Architecture Modeling via UML 2.0 Constructs and MARTE Profile

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.36

Subayal Khan, Kari Tiensyrjä, J. Nurmi

Modeling complex and computationally intensive applications supported by modern mobile devices with standard modeling languages is a challenging task. Within the GENESYS process model, the application modeling phase is thus of key importance. GENESYS manages complexity by employing cross-domain and platform-based application design. The main contribution of this article is to describe the instantiation of GENESYS application architecture modeling via the MARTE profile and to describe a methodology for validating nonfunctional properties annotated in the application model.

Citations: 3

Description-Level Optimisation of Synthesisable Asynchronous Circuits

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.71

L. Tarazona, D. Edwards, A. Bardsley, L. Plana

The syntax-directed synthesis paradigm has been shown to be a powerful synthesis approach. However, its control-driven nature results in significant performance overhead. Some methods to reduce this overhead include peephole optimisations, control resynthesis and component optimisations. This work explores new methods of improving the performance of syntax-directed synthesised asynchronous circuits, using the Balsa synthesis system as the research framework. This includes investigating description styles and the usage of language constructs that exploit the directness of the synthesis method to obtain more concurrent and faster circuits. The techniques and optimisations presented here have been tested on a set of non-trivial examples including a 32-bit processor, a Viterbi decoder, and a channel-sliced wormhole router.

Citations: 2

Cyclic Redundancy Checking (CRC) Accelerator for the FlexCore Processor

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.51

M. Azhar, T. Hoang, P. Larsson-Edefors

A proven approach to increase performance of general-purpose processors is to add hardware accelerators. In its basic configuration, the FlexCore processor has a limited set of datapath units. But thanks to a flexible datapath interconnect and a wide control word, the FlexCore datapath is explicitly designed to support integration of special units that, on demand, can accelerate certain data-intensive applications. We present the integration of a versatile accelerator for several Cyclic Redundancy Checking (CRC) keys. Furthermore, we investigate the accelerator’s impact on processor execution time and energy efficiency, using the Power Stone CRC benchmark. Our evaluation shows that the accelerated 65-nm 2.7-ns FlexCore datapath is, for example, 86% more energy and cycle efficient than a datapath lacking the CRC accelerator.
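For context, the bit-serial shift-and-XOR loop below is the kind of software CRC computation that a hardware accelerator is meant to replace. It is a minimal sketch, not the FlexCore accelerator or the benchmark code: it assumes the common reflected CRC-32 polynomial (0xEDB88320) with all-ones initial value and final XOR, and the function name `crc32_bitwise` is made up for illustration. The `poly` parameter stands in for the idea of supporting several CRC keys.

```python
import zlib


def crc32_bitwise(data: bytes, poly: int = 0xEDB88320) -> int:
    """Bit-serial reflected 32-bit CRC (initial value and final XOR are all ones).

    The default polynomial is the standard CRC-32 used by zlib. Another
    reflected 32-bit polynomial (for example 0x82F63B78, CRC-32C) can be
    passed to select a different key.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        # Eight shift/conditional-XOR steps per input byte: this inner loop
        # is the cycle cost that dedicated CRC hardware avoids.
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ poly
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    message = b"123456789"
    # Cross-check the sketch against the library implementation.
    print(hex(crc32_bitwise(message)), hex(zlib.crc32(message)))
```

With the default polynomial the result can be checked against `zlib.crc32`, which implements the same CRC-32 variant.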

Citations: 9

On Reducing Error Rate of Data Protected Using Systematic Unordered Codes in Asymmetric Channels

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.117

S. Piestrak

Berger-invert codes are coding schemes used to protect communication channels against all asymmetric errors and to decrease power consumption. This paper proposes a method of constructing modified Berger-invert codes that relies on choosing check parts with the smallest possible total weight and assigning low-weight check parts to the most numerous subsets of data with the largest Hamming weights. As a result, the error rate of the transmitted data can be reduced by up to about 23.5% for an 8-bit bus at no cost (no extra bus lines and no additional hardware to implement the encoding and decoding/checking circuitry).
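As background, the sketch below shows the classic Berger code, in which the check part is the binary count of zeros in the data word. It is not the Berger-invert code or the modified construction proposed in the paper; the 8-bit data width, 4-bit check width and all function names are assumptions made for illustration. The demo injects asymmetric 1-to-0 errors and shows that the check no longer matches.

```python
DATA_WIDTH = 8   # assumed bus width in bits
CHECK_WIDTH = 4  # enough bits to hold a zero count between 0 and 8


def berger_check(data: int) -> int:
    """Check part of the classic Berger code: the number of zero bits in the data word."""
    ones = bin(data & ((1 << DATA_WIDTH) - 1)).count("1")
    return DATA_WIDTH - ones


def encode(data: int) -> int:
    """Systematic codeword: data bits followed by the check bits."""
    return (data << CHECK_WIDTH) | berger_check(data)


def is_valid(codeword: int) -> bool:
    """A received word is accepted only if its check part matches its data part."""
    data = codeword >> CHECK_WIDTH
    check = codeword & ((1 << CHECK_WIDTH) - 1)
    return berger_check(data) == check


def inject_one_to_zero(codeword: int, error_mask: int) -> int:
    """Asymmetric error model: bits selected by error_mask can only fall from 1 to 0."""
    return codeword & ~error_mask


if __name__ == "__main__":
    word = encode(0b10110010)
    # Drop one data bit and one check bit from 1 to 0.
    corrupted = inject_one_to_zero(word, 0b100000000100)
    print(is_valid(word), is_valid(corrupted))
```

Any 1-to-0 error raises the zero count of the data part or lowers the value of the check part, never the reverse, so the two can no longer agree; this is why the code detects every asymmetric error of this kind.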

Citations: 1

A Multicore Embedded Processor for Fingerprint Recognition

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

Pub Date : 2010-09-01 DOI: 10.1109/DSD.2010.101

G. Danese, Mauro Giachero, F. Leporati, Nelson Nazzicari

Biometric identification systems exploit automated methods of recognition based on people's physiological or behavioural characteristics. Among these, fingerprints are very affordable biometric identifiers. Building embedded systems that perform real-time authentication requires a fast computational unit for image processing. In this paper we propose a parallel architecture that efficiently implements the computationally demanding core of a matching algorithm based on Band Limited Phase Only spatial Correlation (BLPOC), executed by two concurrent computational units implemented on an Altera Stratix II family FPGA. The resulting device is competitive with similar hardware solutions described in the literature and outperforms the processing capabilities of general-purpose PC processors.

Citations: 9
