Latest publications from the 2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

An Evolutionary Approach to Area-Time Optimization of FPGA designs

2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

Pub Date : 2007-07-16 DOI: 10.1109/ICSAMOS.2007.4285745

Fabrizio Ferrandi, P. Lanzi, G. Palermo, C. Pilato, D. Sciuto, Antonino Tumeo

This paper presents a new methodology based on evolutionary multi-objective optimization (EMO) to synthesize multiple complex modules on programmable devices (FPGAs). It starts from a behavioral description written in a common high-level language (for instance C) and automatically produces the register-transfer level (RTL) design in a hardware description language (e.g. Verilog). Since all high-level synthesis problems (scheduling, allocation and binding) are notoriously NP-complete and interdependent, the three problems should be considered simultaneously. This leads to a wide design space that must be thoroughly explored to obtain solutions that satisfy the design constraints. Evolutionary algorithms are good candidates to tackle such complex explorations. In this paper we provide a solution based on the non-dominated sorting genetic algorithm (NSGA-II) that explores the design space in order to obtain the best solutions in terms of performance given the area constraints of a target FPGA device. Moreover, a good cost estimation model has been integrated to guarantee the quality of the solutions found without requiring a complete synthesis to validate each generation, which would be impractical and time-consuming. We show on the JPEG case study that the proposed approach provides good results in terms of the trade-off between total area occupied and execution time.
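The area-time trade-off described in the abstract rests on Pareto dominance, the relation NSGA-II uses to rank candidate designs. The following is a minimal sketch of extracting the first non-dominated front under an area constraint. It is not the authors' implementation and not the full NSGA-II algorithm (no crowding distance, crossover, or mutation); the function names, the `(area, execution_time)` tuple layout, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical design point: (area, execution_time); both objectives are minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(designs: List[Design], max_area: float) -> List[Design]:
    """Return the feasible designs that no other feasible design dominates."""
    # Discard designs that do not fit the (assumed) area budget of the device.
    feasible = [d for d in designs if d[0] <= max_area]
    front = [d for d in feasible
             if not any(dominates(other, d) for other in feasible)]
    return sorted(set(front))


# Made-up candidates, not results from the paper.
candidates = [(900, 40), (1200, 25), (1500, 22), (1300, 30), (2500, 10), (1200, 28)]
print(pareto_front(candidates, max_area=2000))
```

In this toy data set, `(2500, 10)` is excluded by the area budget, and `(1300, 30)` and `(1200, 28)` are dropped because `(1200, 25)` is at least as good in both objectives.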

Citations: 20

Prototyping Efficient Interprocessor Communication Mechanisms

2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

Pub Date : 2007-07-16 DOI: 10.1109/ICSAMOS.2007.4285730

Vassilis D. Papaefstathiou, D. Pnevmatikatos, M. Marazakis, Giorgos Kalokairinos, Aggelos D. Ioannou, Michael Papamichael, S. Kavadias, Giorgos Mihelogiannakis, M. Katevenis

Parallel computing systems are becoming widespread and growing in sophistication. Besides simulation, rapid system prototyping is becoming important in designing and evaluating their architecture. We present an efficient FPGA-based platform that we developed and use for research and experimentation on high-speed interprocessor communication, network interfaces and interconnects. Our platform supports advanced communication capabilities such as remote DMA, remote queues, zero-copy data delivery and flexible notification mechanisms, as well as link bundling for increased performance. We report on the platform architecture, its design cost, complexity and performance (latency and throughput). We also report our experience implementing benchmarking kernels and a user-level benchmark application, and show how software can take advantage of the provided features while also exposing the weaknesses of the system.

Citations: 6

Online Prediction of Applications Cache Utility

2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

Pub Date : 2007-07-16 DOI: 10.1109/ICSAMOS.2007.4285748

Miquel Moretó, F. Cazorla, Alex Ramírez, M. Valero

General purpose architectures are designed to offer high average performance regardless of the particular application being run. As a consequence, performance and power inefficiencies appear for some programs. Reconfigurable hardware (cache hierarchy, branch predictor, execution units, bandwidth, etc.) has been proposed to overcome these inefficiencies by dynamically adapting the architecture to the application's needs. However, nearly all the proposals use indirect measures or heuristics of performance to decide new configurations, which may lead to inefficiencies. In this paper we propose a runtime mechanism that predicts the throughput of an application on an architecture using a reconfigurable L2 cache. The L2 cache size varies at way granularity, and we predict the performance of the same application on all other L2 cache sizes at the same time. Across different L2 cache sizes we obtain an average error of 3.11%, a maximum error of 16.4% and a standard deviation of 3.7%. The mechanism needs no profiling or operating system participation. We also give a hardware implementation that reduces the hardware cost to under 0.4% of the total L2 size while maintaining high accuracy. This mechanism can be used to reduce power consumption in single-threaded architectures and to improve performance in multithreaded architectures that dynamically partition shared L2 caches.
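The abstract does not say how performance on all other L2 sizes is predicted at once. One textbook technique that makes this possible under LRU replacement is the stack distance histogram: hits are counted per recency position in a set, and a cache with fewer ways keeps exactly the hits at the lower positions. The sketch below illustrates that technique only; whether the paper uses it is an assumption, and the function names, the single-set model, and the access trace are hypothetical.

```python
from typing import List


def stack_distance_histogram(accesses: List[int], max_ways: int) -> List[int]:
    """Count hits per LRU stack position (0 = most recently used) in one cache set."""
    stack: List[int] = []            # tags ordered from most to least recently used
    hist = [0] * max_ways
    for tag in accesses:
        if tag in stack:
            pos = stack.index(tag)   # recency position at which the hit occurred
            hist[pos] += 1
            stack.pop(pos)
        elif len(stack) == max_ways:
            stack.pop()              # evict the least recently used tag
        stack.insert(0, tag)         # the accessed tag becomes most recently used
    return hist


def predicted_hits(hist: List[int], ways: int) -> int:
    """Hits an LRU set with `ways` ways would have seen on the same trace."""
    return sum(hist[:ways])


# Hypothetical tag trace for a single 4-way set.
trace = [1, 2, 3, 1, 2, 4, 1, 3]
hist = stack_distance_histogram(trace, max_ways=4)
for ways in range(1, 5):
    print(ways, predicted_hits(hist, ways))
```

A single pass at the largest configuration therefore yields a hit count for every smaller way count. Turning hit counts into a throughput estimate needs a further performance model, which this sketch leaves out.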

Citations: 12

Secure and Authenticated Communication in Chip-Level Microcomputer Bus Systems with Tree Parity Machines

2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

Pub Date : 2007-07-16 DOI: 10.1109/ICSAMOS.2007.4285752

Sascha Mühlbach, S. Wallner

The protection of chip-level microcomputer bus systems in embedded devices is essential to counter the growing number of hardware hacking attacks. This paper presents an authenticated key exchange and encryption solution that secures chip-level microcomputer bus systems via the tree parity machine rekeying architecture (TPMRA). To this end, a scalable TPMRA IP-core is designed and implemented to meet variable bus performance requirements. It allows the authentication of the bus participants as well as the encryption of chip-to-chip buses from a single primitive. The solution is transparent and easily applicable to an arbitrary microcomputer bus system for embedded devices on the market. A proof-of-concept implementation shows the applicability of the TPMRA in the standardized Advanced Microcontroller Bus Architecture (AMBA) by integrating the IP-core into the peripheral bus-to-bus interface (AHB-APB bridge). We show that the solution is latency-free and can protect the ARM bus system with low hardware overhead while taking all AMBA bus features into account.
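Tree parity machines derive a shared key by mutual learning: two parties feed the same random inputs to their own small networks, exchange only the output bits, and adjust their weights until both weight sets coincide. The sketch below shows that synchronisation step with the commonly described Hebbian update rule. It is not the TPMRA IP-core and does not cover authentication or bus encryption; the parameters `K`, `N`, `L`, the class and function names, and the convention of mapping a zero sum to -1 are assumptions chosen for illustration.

```python
import random
from typing import List

K, N, L = 3, 4, 3   # hidden units, inputs per unit, weight bound (illustrative values)


def sign(x: int) -> int:
    return 1 if x > 0 else -1        # assumed convention: a zero sum maps to -1


class TreeParityMachine:
    def __init__(self, rng: random.Random) -> None:
        self.w = [[rng.randint(-L, L) for _ in range(N)] for _ in range(K)]
        self.sigma = [0] * K
        self.tau = 0

    def output(self, x: List[List[int]]) -> int:
        """Compute hidden outputs sigma and the overall output tau (their product)."""
        self.sigma = [sign(sum(wi * xi for wi, xi in zip(self.w[k], x[k])))
                      for k in range(K)]
        self.tau = 1
        for s in self.sigma:
            self.tau *= s
        return self.tau

    def update(self, x: List[List[int]], tau_other: int) -> None:
        """Hebbian rule: learn only when both outputs agree, only in units with sigma == tau."""
        if self.tau != tau_other:
            return
        for k in range(K):
            if self.sigma[k] == self.tau:
                self.w[k] = [max(-L, min(L, wi + self.tau * xi))
                             for wi, xi in zip(self.w[k], x[k])]


def synchronise(a: TreeParityMachine, b: TreeParityMachine,
                rng: random.Random, max_steps: int = 100_000) -> int:
    """Train both machines on shared random inputs; return steps used, or -1 if not synchronised."""
    for step in range(1, max_steps + 1):
        x = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
        ta, tb = a.output(x), b.output(x)
        a.update(x, tb)
        b.update(x, ta)
        if a.w == b.w:
            return step
    return -1


rng = random.Random(2007)
alice, bob = TreeParityMachine(rng), TreeParityMachine(rng)
steps = synchronise(alice, bob, rng)
print("synchronised after", steps, "steps" if steps != -1 else "(not synchronised)")
```

Once the weights match, both parties can derive key material from them without the weights ever having crossed the bus. In a real system each side would compare a digest of its weights rather than the weights themselves; the direct comparison here is a simulation shortcut.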

Citations: 10
