Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools: Latest Articles

Embedded software: how to make it efficient?

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115370

P. Marwedel

This paper stresses the importance of designing efficient embedded software and it provides a global view of some of the techniques that have been developed to meet this goal. These techniques include high-level transformations, compiler optimizations reducing the energy consumption of embedded programs and optimizations exploiting architectural features of embedded processors. Such optimizations lead to significant reductions of the execution time, the required energy and the memory size of embedded applications. Despite this, they can hardly be found in any available compiler.

Citations: 11

Integration of instruction set simulators into SystemC high level models

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115360

Ilia Oussorov, W. Raab, J. Hachmann, A. Kravtsov

This paper discusses the integration of instruction set simulators (ISS) for processor cores into high-level system models. It addresses approaches to data communication between high-level modules and the ISS, as well as the synchronization between these parts.

Citations: 9

A hybrid evolutionary algorithm for Multi-FPGA systems design

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115352

J. Hidalgo, J. Lanchares, Aitor Ibarra, R. Hermida

Genetic algorithms (GAs) are stochastic optimization heuristics in which searches in solution space are carried out by imitating the population genetics stated in Darwin's theory of evolution. The compact genetic algorithm (cGA) does not manage a population of solutions but only mimics its existence. The combination of genetic and local search heuristics has been shown to be an effective approach to solve some optimization problems more efficiently than with a single GA or a cGA. The multi-FPGA systems design flow has three major tasks: partitioning, placement and routing. In this paper we present a new hybrid algorithm that exploits a cGA in order to generate high quality partitioning and placement solutions and, by means of a local search heuristic, improves the solutions obtained using a cGA or a GA.

Citations: 8
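
The hybrid idea in the abstract above (a cGA that keeps only a probability model of a population, followed by a local search) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the integer-counter form of the probability vector, the single-bit-flip hill climber, and the OneMax toy fitness. The paper's actual partitioning and placement encoding and its local search heuristic are not given in the abstract.

```python
import random


def compact_ga(n_bits, fitness, virtual_pop=50, max_steps=20000, seed=0):
    """Compact GA: evolve a probability vector instead of a real population."""
    rng = random.Random(seed)
    # counts[i] / virtual_pop is the probability that bit i is 1.
    counts = [virtual_pop // 2] * n_bits

    def sample():
        return [1 if rng.random() * virtual_pop < c else 0 for c in counts]

    for _ in range(max_steps):
        a, b = sample(), sample()
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:
                # Shift the model one step towards the winner's bit value.
                counts[i] += 1 if winner[i] == 1 else -1
        if all(c in (0, virtual_pop) for c in counts):
            break  # every bit position has converged
    return [1 if 2 * c >= virtual_pop else 0 for c in counts]


def hill_climb(bits, fitness):
    """Local search: keep any single-bit flip that improves the fitness."""
    bits = list(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            before = fitness(bits)
            bits[i] ^= 1
            if fitness(bits) > before:
                improved = True
            else:
                bits[i] ^= 1  # undo a flip that did not help
    return bits


def hybrid_search(n_bits, fitness, seed=0):
    """cGA for global exploration, then local search for refinement."""
    return hill_climb(compact_ga(n_bits, fitness, seed=seed), fitness)


def onemax(bits):
    """Toy fitness (number of ones), standing in for a partitioning cost."""
    return sum(bits)


best = hybrid_search(16, onemax)
print(best, onemax(best))
```

The cGA needs memory only for one counter per bit, which is the sense in which it "mimics" a population of size `virtual_pop` without storing it.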

Bit-level allocation of multiple-precision specifications

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115396

M. Molina, J. Mendias, R. Hermida

This paper proposes an allocation algorithm able to perform the combined resource selection and operation binding of multiple-precision specifications that maximizes the bit-level reuse of hardware resources. Additionally, it presents an analytic method to estimate the amount of area that our approach could save in comparison with traditional allocation algorithms. In order to minimize the cost of the implementations obtained, the proposed algorithm produces circuits only influenced by the maximum number of bits calculated per cycle. This approach contrasts with the cost of implementations designed by traditional algorithms, which also depends on the number and widths of the operations executed in every cycle.

Citations: 0

An asynchronous victim cache

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115345

D. Hormdee, J. Garside, S. Furber

Memory bandwidth is a limiting factor with many modern microprocessors and it is usual to include a cache to reduce the amount of memory traffic. Of the two commonly used cache write-policies, the copy-back approach is better than the write-through approach in this respect. The performance of both approaches can be further aided by the inclusion of a small buffer in the path of outgoing writes to the main memory, especially if this buffer is capable of forwarding its contents back into the main cache if they are needed again before they are emptied from the buffer. This is what is known as a victim cache. For an asynchronous microprocessor it is logical that the cache system should be asynchronous as well, since a large degree of the flexibility of an asynchronous microprocessor would be lost if it were to use a standard synchronous memory interface. However, implementing a forwarding mechanism in an asynchronous system is more difficult because the data to be forwarded is flowing in a manner unsynchronised to the process which requires it. This paper presents an architecture for a victim cache to resolve forwarding in a totally asynchronous environment. The resultant structure forms a key part of an asynchronous copy-back cache system for the Amulet3, a third generation asynchronous implementation of the ARM processor.

Citations: 3
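
The forwarding behaviour described in the abstract above can be sketched as a small behavioural model: a copy-back cache whose evicted dirty lines wait in a victim buffer and are forwarded back on a later miss instead of being fetched from main memory. This is a sequential software model under assumptions chosen for illustration (a direct-mapped cache, one word per line, a FIFO victim buffer, and the hypothetical class name `VictimCachedMemory`). It says nothing about the asynchronous circuit design that is the paper's actual contribution.

```python
from collections import OrderedDict


class VictimCachedMemory:
    """Direct-mapped copy-back cache with a small victim buffer (toy model)."""

    def __init__(self, n_lines=4, n_victims=2):
        self.n_lines = n_lines
        self.n_victims = n_victims
        self.lines = {}               # index -> (addr, value, dirty)
        self.victims = OrderedDict()  # addr -> value, dirty lines awaiting write-back
        self.memory = {}              # main memory, unwritten words read as 0
        self.memory_writes = 0

    def _evict(self, index):
        if index not in self.lines:
            return
        addr, value, dirty = self.lines.pop(index)
        if dirty:
            # Copy-back: dirty data goes to the victim buffer, not straight to memory.
            self.victims[addr] = value
            if len(self.victims) > self.n_victims:
                old_addr, old_value = self.victims.popitem(last=False)
                self.memory[old_addr] = old_value
                self.memory_writes += 1

    def _fill(self, addr):
        index = addr % self.n_lines
        line = self.lines.get(index)
        if line is not None and line[0] == addr:
            return index  # cache hit
        if addr in self.victims:
            # Forwarding: the line is still in the buffer, so main memory is stale.
            value, dirty = self.victims.pop(addr), True
        else:
            value, dirty = self.memory.get(addr, 0), False
        self._evict(index)
        self.lines[index] = (addr, value, dirty)
        return index

    def read(self, addr):
        return self.lines[self._fill(addr)][1]

    def write(self, addr, value):
        index = self._fill(addr)
        self.lines[index] = (addr, value, True)


cache = VictimCachedMemory()
cache.write(0, 11)
cache.write(4, 22)   # address 4 maps to the same line and evicts address 0
print(cache.read(0), cache.memory_writes)
```

In this model the read of address 0 is served from the victim buffer, so no main-memory write has happened yet even though two dirty lines have competed for the same cache line.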

Speeding up elliptic cryptosystems using a new signed binary representation for integers

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115395

R. Katti

Two new signed binary representations are presented that simplify hardware or software necessary in elliptic curve cryptosystems. Simplified algorithms are presented for computing the new binary representations. This speeds up elliptic cryptosystems. The algorithms are useful for smart card and digital signature verification applications. The first algorithm computes a new representation for an integer d and speeds up the computation of d × P, where P is a point on an elliptic curve. The second algorithm computes a new representation for two integers g and h and speeds up the computation of (g × P) + (h × Q), where P and Q are points on an elliptic curve.

Citations: 10
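
To show why a signed binary representation helps with d × P, here is a minimal sketch using the classic non-adjacent form (NAF). NAF is a well-known signed-digit recoding and is swapped in here purely for illustration; it is not the new representation proposed in the paper, whose details the abstract does not give. The group operations are passed in as functions, and plain integer addition stands in for elliptic curve point addition so the sketch runs without a curve library.

```python
def naf(d):
    """Non-adjacent form of d >= 0: digits in {-1, 0, 1}, least significant first."""
    digits = []
    while d > 0:
        if d & 1:
            digit = 2 - (d % 4)  # +1 if d = 1 (mod 4), -1 if d = 3 (mod 4)
            d -= digit
        else:
            digit = 0
        digits.append(digit)
        d >>= 1
    return digits


def scalar_mult(d, point, add, neg, zero):
    """Compute d x point by double-and-add/subtract over the NAF digits."""
    result = zero
    for digit in reversed(naf(d)):
        result = add(result, result)          # doubling step
        if digit == 1:
            result = add(result, point)       # addition step
        elif digit == -1:
            result = add(result, neg(point))  # subtraction step
    return result


# Stand-in group: integers under addition, so d x P is ordinary multiplication.
print(naf(29))
print(scalar_mult(29, 5, add=lambda a, b: a + b, neg=lambda a: -a, zero=0))
```

Point negation is cheap on an elliptic curve, so allowing the digit -1 costs little, while fewer nonzero digits mean fewer point additions. In NAF no two adjacent digits are nonzero.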

Reconfigurable hardware implementation of Montgomery modular multiplication and parallel binary exponentiation

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115373

N. Nedjah, L. M. Mourelle

Modular exponentiation and modular multiplication are the cornerstone computations performed in public-key cryptography systems such as the RSA cryptosystem. The operations are time consuming for large operands. Much research effort is directed towards an efficient hardware implementation of both operations. This paper describes the characteristics of two architectures: the first implements modular multiplication using a systolic version of the fast Montgomery algorithm, and the second implements the parallel binary exponentiation algorithm. The latter uses two Montgomery modular multipliers. Results in terms of space and time requirements for an FPGA prototype are given.

Citations: 25
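
The two operations named in the abstract above can be sketched in software as follows. This is a word-level textbook formulation of Montgomery multiplication combined with right-to-left binary exponentiation, written under assumptions made for illustration: the function names, the choice R = 2^k with k the bit length of the modulus, and Python 3.8+ for `pow(n, -1, r)`. It is not the systolic, bit-serial hardware architecture the paper describes.

```python
def montgomery_setup(n):
    """Return (k, n_prime) with R = 2**k > n and n * n_prime = -1 (mod R)."""
    if n % 2 == 0:
        raise ValueError("Montgomery reduction needs an odd modulus")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # modular inverse via pow needs Python 3.8+
    return k, n_prime


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, without dividing by n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R is a shift
    return u - n if u >= n else u


def mont_pow(base, exp, n):
    """Right-to-left binary exponentiation carried out in Montgomery form."""
    k, n_prime = montgomery_setup(n)
    r = 1 << k
    x = (base % n) * r % n  # base converted to Montgomery form
    acc = r % n             # 1 in Montgomery form
    while exp > 0:
        # The two products below do not depend on each other within an
        # iteration, so two multipliers could compute them side by side.
        if exp & 1:
            acc = mont_mul(acc, x, n, k, n_prime)
        x = mont_mul(x, x, n, k, n_prime)
        exp >>= 1
    return mont_mul(acc, 1, n, k, n_prime)  # convert back out of Montgomery form


print(mont_pow(7, 65537, 3233 * 5 + 2) == pow(7, 65537, 3233 * 5 + 2))
```

The independence of the multiply and square steps in the right-to-left scheme is one plausible reading of why the exponentiation architecture uses two Montgomery multipliers; the abstract itself does not spell this out.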

Simplifying instruction issue logic in superscalar processors

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115388

Toshinori Sato, I. Arita

Modern microprocessors schedule instructions dynamically in order to exploit instruction-level parallelism. It is necessary to increase instruction window size for improving instruction scheduling capability. However, it is difficult to increase the size without any serious impact on processor performance, since the instruction window is one of the dominant determiners of processor cycle time. The instruction window is critical because it is realized using content addressable memory (CAM). In general, RAMs are faster in access time and lower in power dissipation than CAMs. Therefore, it is desirable that the CAM instruction window is replaced by the RAM instruction window. This paper proposes such an instruction window, named the explicit data forwarding instruction window. The principle behind our proposal is to make result forwarding explicit. It is possible to dynamically construct explicit relationships between instructions, since it is expected that each execution result is forwarded to a limited number of dependent instructions. Simulation results show that the explicit data forwarding instruction window achieves a level of performance comparable to that of the conventional instruction window, while also providing benefit in terms of shorter cycle time.


Citations: 1

Reachability analysis for formal verification of SystemC

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115387

R. Drechsler, Daniel Große

With ever increasing design sizes, verification becomes the bottleneck in modern design flows. Up to 80% of the overall costs are due to the verification task. Formal methods have been proposed to overcome the limitations of simulation approaches. But these techniques have mainly been applied to lower levels of abstraction. With more and more design complexity the need for hardware description languages with a high level of abstraction becomes obvious. We present a formal verification approach for circuits described in SystemC, an extension of C that allows the modeling of hardware. An algorithm for reachability analysis is proposed and a case study of a scalable bus arbiter cell is given.

Citations: 43
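
Reachability analysis, as mentioned in the abstract above, computes the set of states a design can reach from its initial states by repeatedly applying the transition relation until nothing new appears. The sketch below does this with explicit Python sets on a toy two-client arbiter. Both the explicit-state formulation and the arbiter model (a rotating priority token, hypothetical function names) are assumptions made for illustration; the paper's own algorithm, its state representation, and its bus arbiter cell are not described in the abstract.

```python
from itertools import product


def reachable(initial, successors):
    """Least fixed point: all states reachable from the initial states."""
    reached = set(initial)
    frontier = set(initial)
    while frontier:
        image = set()
        for state in frontier:
            image.update(successors(state))
        frontier = image - reached  # keep only states not seen before
        reached |= frontier
    return reached


def arbiter_successors(state):
    """Toy arbiter: state = (token, grants); the token holder has priority."""
    token, _grants = state
    next_states = set()
    for req in product((False, True), repeat=2):  # all request combinations
        order = (token, 1 - token)
        winner = next((i for i in order if req[i]), None)
        grants = tuple(i == winner for i in range(2))
        next_states.add((1 - token, grants))      # the token rotates every cycle
    return next_states


states = reachable({(0, (False, False))}, arbiter_successors)

# Safety property: the two grant signals are never active together.
mutual_exclusion = all(not (g[0] and g[1]) for _token, g in states)
print(len(states), mutual_exclusion)
```

Once the reachable set is known, a safety property is checked by testing it on every reachable state, which covers all input sequences rather than the subset a simulation run would exercise.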

An evaluation of an FPGA run-time support system

Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools

Pub Date : 2002-09-04 DOI: 10.1109/DSD.2002.1115382

P. Green, M. Vakondios, M. Edwards

In a previous paper we presented the concept, design and implementation of an FPGA run-time support system (FSS) for a dynamically reconfigurable FPGA. In this paper we discuss our experiences in running applications on the system. Problems with tool support meant that the full capability of the device could not be exploited; nevertheless, a significant application was executed under FSS control on the system. We discuss how both the application itself and the FSS were tuned to improve overall performance. The paper concludes by considering how our experience impacts upon the development of FSS-like software for the latest generation of reconfigurable devices.

Citations: 3
