8th Euromicro Conference on Digital System Design (DSD'05): latest papers

Designing a binary neural network co-processor

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.34

M. Freeman, J. Austin

A correlation matrix memory (CMM) is a form of binary neural network that can be used for high-speed approximate search and match operations on large unstructured datasets. Typically, the processing requirements of a CMM do not map efficiently onto a modern processor-based system, so an application-specific co-processor is normally used to improve performance. This paper outlines two possible FPGA-based co-processors for executing core CMM operations based upon a compact bit vector (CBV) data format. This representation significantly increases a system's storage capacity, but reduces processing performance.
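As an illustration of the kind of operation a CMM performs, the following is a minimal software sketch of a binary correlation matrix memory. Training ORs the outer product of an input/output pattern pair into a bit matrix; recall counts, for each output row, how many set input bits hit a stored weight, then thresholds the counts. It does not model the paper's CBV format or its FPGA co-processors, and the names `new_matrix`, `train`, and `recall`, as well as the fixed recall threshold, are assumptions made for this sketch.

```python
def new_matrix(num_outputs, num_inputs):
    """Create an all-zero binary weight matrix (rows = outputs, columns = inputs)."""
    return [[0] * num_inputs for _ in range(num_outputs)]


def train(matrix, input_bits, output_bits):
    """Store one association by OR-ing the outer product into the matrix."""
    for row, out_bit in enumerate(output_bits):
        if out_bit:
            for col, in_bit in enumerate(input_bits):
                if in_bit:
                    matrix[row][col] = 1


def recall(matrix, input_bits, threshold):
    """Sum the weights selected by the input bits, then threshold each row sum."""
    sums = [sum(w & x for w, x in zip(row, input_bits)) for row in matrix]
    return [1 if s >= threshold else 0 for s in sums]


# Hypothetical usage: two stored associations, recalled with threshold 2
# (the number of set bits in each input pattern).
cmm = new_matrix(3, 4)
train(cmm, [1, 0, 1, 0], [1, 0, 0])
train(cmm, [0, 1, 0, 1], [0, 1, 0])
first = recall(cmm, [1, 0, 1, 0], threshold=2)
second = recall(cmm, [0, 1, 0, 1], threshold=2)
```

Lowering the threshold below the number of set input bits gives the approximate (partial) matching the abstract refers to, at the cost of more false matches.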

Citations: 6

Functional vectors generation for RT-level Verilog descriptions based on path enumeration and constraint logic programming

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.43

Tun Li, Yang Guo, GongJie Liu, Sikun Li

This paper presents a novel method for automatic functional vector generation from RT-level HDL descriptions, based on path coverage and constraint solving. Compared with existing methods, the advantages of this method include: 1) it avoids generating redundant constraints, which accelerates the test generation process; 2) it solves the problem of how to propagate internal values to the primary inputs with decision models; 3) it can handle various HDL description styles and various styles of design. Experimental results on several practical designs show that the method can efficiently improve the functional vector generation process. The prototype system has been applied to verify the RTL description of a real 32-bit microprocessor core, and complex bugs that had remained hidden in the RTL description were detected.

Citations: 8

Vital signs remote management system for PDAs

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.76

Danielly Cruz, E. Barros

Providing an efficient healthcare service is a challenge for countries with continental dimensions. The increasing costs of health care systems make mechanisms for more efficient and better patient care necessary. This work proposes a system for monitoring vital signs (including ECG) through PDAs. The system makes it possible for medical practitioners (here called health agents) to attend patients locally, with the support of specialist physicians through a second-opinion system. The proposed approach supports recording and visualization of ECG waveforms. Moreover, patient information can be transmitted to and from a remote health care server. To make the system easier for doctors and health agents to use, a user-friendly graphical interface has been developed. Methods for efficient data access have also been developed to cope with the storage constraints of PDAs.

Citations: 8

Improvement of the fault coverage of the pseudo-random phase in column-matching BIST

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.51

Petr Fišer, H. Kubátová

This paper presents several methods for improving the fault coverage in mixed-mode BIST. The test is divided into two phases: pseudo-random and deterministic. The pseudo-random phase should detect as many faults as possible, to reduce the number of faults left to be covered in the deterministic phase. We study the properties of different pseudo-random pattern generators. Their success in covering faults depends strictly on the circuit under test. We examine the properties of LFSRs and cellular automata. Four methods for enhancing the pseudo-random fault coverage are proposed. We then propose a universal method for efficiently computing test weights. The observations are documented on some of the standard ISCAS benchmarks, and the final BIST circuitry is synthesized using the column-matching method.
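An LFSR-based pseudo-random pattern generator of the kind examined in this abstract can be sketched in software as below. This is a generic Fibonacci LFSR, not the generators or the column-matching flow from the paper; the function name `lfsr_patterns`, the 4-bit width, and the tap positions (4, 3), corresponding to the polynomial x^4 + x^3 + 1, are choices made for this example.

```python
def lfsr_patterns(seed, taps, width, count):
    """Yield `count` successive states of a Fibonacci LFSR.

    `taps` are 1-based bit positions (1 = least significant bit) that are
    XOR-ed together to form the bit shifted in on each step.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for _ in range(count):
        yield state
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        # Shift left by one and insert the feedback bit at the bottom.
        state = ((state << 1) | feedback) & mask


# Hypothetical usage: a 4-bit LFSR with taps (4, 3) visits every
# nonzero state once before repeating, giving 15 distinct test patterns.
patterns = list(lfsr_patterns(seed=0b0001, taps=(4, 3), width=4, count=16))
```

The all-zero state is never reached from a nonzero seed, which is why a maximal-length n-bit LFSR produces 2^n - 1 patterns rather than 2^n.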

Citations: 4

High-quality sub-function construction in the information-driven circuit synthesis with gates

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.48

L. Józwiak, S. Bieganski

The opportunities created by modern microelectronic technology cannot be exploited effectively because of weaknesses in the traditional circuit synthesis methods used in today's CAD tools. This paper discusses a new information-driven circuit synthesis method that targets combinational circuits implemented with gates. The method is based on our original information-driven approach to circuit synthesis, bottom-up general functional decomposition, and the theory of information relationship measures, and it differs considerably from all other known methods. The discussion focuses on the various sub-function construction methods used during synthesis. Experimental results from the automatic circuit synthesis tool that implements the method show that the sub-function construction methods we developed specifically for gate-based circuits deliver much better circuits than the other methods, and demonstrate that information-driven general decomposition produces very fast and compact gate-based circuits.

Citations: 2

Implementation of a block based neural branch predictor

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.49

O. Cadenas, G. Megson, Daniel Jones

This paper contributes to a dynamic branch predictor algorithm based on a perceptron in two directions. First, a new block form of computation is introduced that theoretically halves the combinational critical path for computing a prediction. Second, an implementation in FPGA hardware is fully developed for quantitative comparison. FPGA circuits for a one-cycle block predictor achieve clock rates 1.7 times faster than a direct implementation of the original perceptron predictor. This faster clock makes it possible to realize predictions with longer history lengths for the same hardware budget.
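For readers unfamiliar with the baseline, the following is a minimal software sketch of a single perceptron branch predictor: the prediction is the sign of a bias weight plus a dot product of weights with the recent branch history, and the weights are adjusted on a misprediction or a low-confidence output. It illustrates only the general perceptron scheme, not the block form or the FPGA circuits described in the abstract; the class name, the single-perceptron simplification (no table indexed by branch address), and the threshold value used in the example are assumptions.

```python
class PerceptronPredictor:
    """One perceptron over a global history of +1 (taken) / -1 (not taken)."""

    def __init__(self, history_length, threshold):
        self.weights = [0] * (history_length + 1)  # weights[0] is the bias
        self.history = [-1] * history_length
        self.threshold = threshold

    def output(self):
        """Bias plus dot product of the remaining weights with the history."""
        return self.weights[0] + sum(
            w * h for w, h in zip(self.weights[1:], self.history)
        )

    def predict(self):
        """Predict taken when the perceptron output is non-negative."""
        return self.output() >= 0

    def update(self, taken):
        """Train on the actual outcome, then shift it into the history."""
        y = self.output()
        t = 1 if taken else -1
        # Train on a misprediction or when the output magnitude is small.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            self.weights[0] += t
            for i, h in enumerate(self.history):
                self.weights[i + 1] += t * h
        self.history = self.history[1:] + [t]


# Hypothetical usage: a branch that is never taken.
predictor = PerceptronPredictor(history_length=2, threshold=4)
for _ in range(20):
    predictor.update(taken=False)
```

The dot product in `output` is the combinational critical path that grows with history length, which is the quantity the abstract's block form aims to shorten.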

Citations: 4

ARPA - a technology independent and synthetizable system-on-chip model for real-time applications

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.19

Arnaldo S. R. Oliveira, V. Sklyarov, A. Ferrari

This paper describes the advanced real-time processor architecture (ARPA) system-on-chip. The goal of this work is to create a technology-independent and synthetizable system-on-chip (SoC) model for real-time applications. The main component of the SoC is a MIPS32-based RISC processor, implemented using a pipelined simultaneous multithreading structure that supports the execution of more than one thread or task at a time. The synergy between pipelining and simultaneous multithreading makes it possible to combine the exploration of instruction-level and task-level parallelism, hide the context-switching time, and reduce the need for complex speculative execution techniques to improve the performance of the pipelined processor. A fundamental component of the ARPA SoC is the operating system coprocessor, which implements in hardware some operating system functions, such as task scheduling, switching, communication and timing. The proposed architecture allows building flexible, high-performance, time-predictable and power-efficient processors optimized for embedded real-time systems.

Citations: 1

Improved fault emulation for synchronous sequential circuits

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.50

J. Raik, P. Ellervee, Valentin Tihhomirov, R. Ubar

This paper presents new alternatives for accelerating fault simulation for sequential circuits by hardware emulation on an FPGA. Fault simulation is an important subtask in test pattern generation and is used frequently throughout the test generation process. The problems associated with fault emulation for sequential circuits are explained and alternative implementations are discussed. An environment for hardware emulation of fault simulation is presented; it incorporates hardware support for fault dropping. The proposed approach speeds up simulation by 40 to 500 times compared to the state of the art in fault simulation. The average speedup provided by the method is 250 times, about an order of magnitude higher than previously reported in the literature. Based on the experiments, we conclude that emulation is beneficial when a large number of test vectors is required.

Citations: 6

Optimization of electronic power consumption in wireless sensor nodes

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.60

S. Jayapal, S. Ramachandran, R. Bhutada, Y. Manoli

Because power is limited in wireless sensor nodes, special attention is required in optimizing the power consumption of the necessary electronics on a node. This can be done at different levels of abstraction. Architectural-level optimization brings a major power reduction, because any change made at this level is reflected in the lower levels, but all other levels must also be considered in an overall power reduction strategy. This paper discusses different possibilities for power reduction at the system, architectural, and circuit levels of the node's electronics. It also addresses different communication protocols and their effect on the power consumption of a wireless sensor node.

Citations: 9

Dynamic split: flexible border between instruction and data cache

8th Euromicro Conference on Digital System Design (DSD'05)

Pub Date : 2005-08-30 DOI: 10.1109/DSD.2005.35

P. Trancoso

Current microprocessors are optimized for average use. Nevertheless, different applications are known to impose different demands on the system. This work focuses on the reconfiguration of the first-level caches. To achieve good performance, the first-level cache is physically split into two parts, one for instructions and one for data. This separation avoids interference between instructions and data, but it is strict and determined at design time. In this work we show a cache design that is able to change the split dynamically at runtime. The proposed design was tested by simulating a variety of benchmark applications from the MiBench suite on two baseline architectures: embedded XScale and high-end PowerPC. The results show that, while the average miss rate reduction may seem small, certain applications show a benefit larger than 90%. For miss rate reduction, the dynamic split cache seems to be more relevant for the cache with the smaller associativity (PowerPC). Lastly, the dynamic split cache was also used to reduce energy consumption without loss of performance. This feature resulted in a significant energy reduction, and the results showed that it has a bigger impact for caches with larger associativity (42% energy reduction for the XScale and 28% for the PowerPC for a large data set size).

Citations: 1
