2011 Electronic System Level Synthesis Conference (ESLsyn): Latest Papers

Increasing computational density of application-specific systems

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952293

Michael D. Wilder, R. Rinker

Application-specific systems are increasingly being deployed on reconfigurable computing platforms such as the field-programmable gate array (FPGA). These systems can integrate many disparate computing elements, and often contain soft processors hosting application components. Soft processors are sequential, synchronous devices with low computational density, and are not capable of exploiting the concurrency available on the FPGA. We present a method for increasing the computational density of application-specific systems by eliminating soft processors within these systems. This method eliminates soft processors by replacing programs that would be hosted on soft processors with custom, self-contained, circuitizable finite-state machine with datapath (FSMD) components that are automatically generated. We show that FSMDs produced using this method eliminate the computational overhead associated with fetching and decoding instructions. We further show that this method, when applied to interrupt-driven programs, can produce concurrent FSMDs that arbitrate for shared datapath resources. We discuss how these FSMDs are capable of leveraging the spatial computational capabilities of the FPGA and are therefore better suited for deployment within application-specific systems. We show that these FSMDs eliminate overhead associated with interrupt context switching, decrease interrupt servicing latencies, and eliminate interrupt livelock. We discuss implications and limitations of this method, and describe a prototype that implements the method for programs targeted for the Intel 8051.
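
To make the term concrete, here is a minimal sketch of a finite-state machine with datapath, written in Python for readability. It is not the output of the authors' generator or of their 8051 prototype: the subtractive GCD example, the class name `GcdFsmd`, and the state names are all assumptions chosen for illustration. What it is meant to show is that the control state drives register updates directly, with no instruction fetch or decode step in between.

```python
from dataclasses import dataclass


@dataclass
class GcdFsmd:
    """Toy FSMD (hypothetical example): a control state plus datapath registers a and b."""
    a: int
    b: int
    state: str = "TEST"
    cycles: int = 0

    def step(self) -> None:
        """Advance one clock cycle. Inputs are assumed to be positive integers."""
        if self.state == "TEST":
            # The comparator result selects the next control state.
            if self.a == self.b:
                self.state = "DONE"
            elif self.a > self.b:
                self.state = "SUB_A"
            else:
                self.state = "SUB_B"
        elif self.state == "SUB_A":
            self.a -= self.b          # datapath: subtractor writes register a
            self.state = "TEST"
        elif self.state == "SUB_B":
            self.b -= self.a          # datapath: subtractor writes register b
            self.state = "TEST"
        else:
            return                    # DONE: registers hold their values
        self.cycles += 1

    def run(self) -> int:
        """Clock the machine until it reaches DONE and return the result register."""
        while self.state != "DONE":
            self.step()
        return self.a


def run_gcd(a: int, b: int) -> tuple[int, int]:
    """Return (gcd, cycles used) for two positive integers."""
    machine = GcdFsmd(a, b)
    return machine.run(), machine.cycles
```

Calling `run_gcd(12, 18)` returns the greatest common divisor together with the number of simulated cycles. Every cycle either evaluates the comparator or updates a register, which is the property the abstract contrasts with a soft processor's fetch and decode overhead.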

Citations: 0

A framework for generic HW/SW communication using remote method invocation

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952289

Philipp A. Hartmann, K. Gruttner, Philipp Ittershagen, A. Rettberg

Implementation of communication between different tasks of a concurrent embedded system is a challenging problem. The aim of our work is to support the refinement and relocation of tasks onto different execution units, such as processors running different operating systems or even dedicated hardware. For this purpose communication should be transparent and as independent as possible from the underlying middleware or embedded operating system. Moreover, communication should also be transparent across the HW/SW boundary. In this work we present a generic framework for seamless communication of (software) tasks with shared resources, called Shared Objects. Communication is implemented using a method-based interface realizing a Remote Method Invocation (RMI) protocol. Our shared communication resources can be implemented as dedicated hardware, as shared memory, or as local tasks. The presented framework is a first step towards the unification of shared resource access based on embedded Linux. The effectiveness of our approach is evaluated with different task mappings and shared resource access implementation styles.

Citations: 15

Application-specific codesign platform generation for digital mockups in cyber-physical systems

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952295

Bailey Miller, Frank Vahid, T. Givargis

The testing of cyber-physical systems requires validating device functionality for a wide range of operating conditions. The environment with which the cyber-physical device interacts, such as lungs for a medical ventilator device or a busy freeway for an autonomous vehicle, may be complex, making it difficult to explore all possible configurations. Computer simulations that utilize device and environment behavioral models may be used as a first stage of testing, but at some point development must occur using the real device running in real time. We present a codesign framework for aiding cyber-physical device development where real devices or prototypes are connected to real-time models that simulate the interacting environment. Such test setups are known as digital mockups and allow for testing environment scenarios that are hard to capture with commonly used but limited physical mockups. The framework supports model hardware/software codesign to enable models of varying speed and accuracy to be implemented within an embedded processor or as a custom coprocessor circuit on an FPGA. We describe an accompanying tool that generates code templates to reduce the time required to develop digital mockup test setups. We utilize the framework to build a digital mockup test setup for a commercial ventilator, and showcase codesign capabilities by implementing environmental models both as circuits and as instructions on a processor.

Citations: 13

Enabling the synthesis of very long operation properties

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952292

Jan Langer, Thomas Horn, U. Heinkel

In previous work, the high-level synthesis of operation properties has been proposed. In this work, we improve the existing algorithms in order to allow the synthesis of more efficient hardware models. For very long properties in particular, no model could previously be generated, because both the runtime of the synthesis process and the amount of hardware resources used were prohibitively high. The proposed improvements are threefold. First, the generated non-deterministic control automaton is replaced by a deterministic automaton using an optimized power set construction algorithm. This significantly reduces the number of registers in the generated model. Second, a property can contain local variables (freeze variables) that capture a value at a specific time step and provide this value throughout a property's life span. The scheduling of storage registers for these variables has been optimized. The last improvement merges equivalent assignments to output or state variables (commitments). The merging not only avoids the generation of redundant hardware resources but also simplifies the output multiplexer of the model. Finally, a case study is presented that involves an industrial design of a framer component. The design properties describe the processing of a complete data frame that is 19,440 cycles long. High-level synthesis and subsequent logic synthesis have been successful and show that the design methodology and synthesis algorithms result in a design with resource usage similar to the industrial component.
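
For readers unfamiliar with the first improvement, the following is a minimal sketch of the textbook power set (subset) construction that turns a non-deterministic automaton into a deterministic one. It is not the optimized variant the abstract refers to, whose details are not given here. The function names, the dictionary encoding of the transition relation, and the small example automaton (binary strings ending in `01`) are assumptions made for illustration.

```python
from collections import deque


def determinize(nfa, start, accepting, alphabet):
    """Textbook subset construction (no epsilon moves).

    nfa maps (state, symbol) -> set of successor states.
    Returns (dfa transitions, DFA start state, set of accepting DFA states),
    where each DFA state is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    work = deque([start_set])
    while work:
        current = work.popleft()
        for sym in alphabet:
            # Union of all NFA successors of the states in the current subset.
            target = frozenset(t for s in current for t in nfa.get((s, sym), ()))
            dfa[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                work.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting


def accepts(dfa, start_set, dfa_accepting, word):
    """Run the deterministic automaton over a word."""
    state = start_set
    for sym in word:
        state = dfa[(state, sym)]
    return state in dfa_accepting


# Hypothetical example NFA: accepts binary strings that end in "01".
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
DFA, DFA_START, DFA_ACCEPTING = determinize(NFA, "q0", {"q2"}, "01")
```

Only subsets reachable from the start state are explored, so the result is often far smaller than the full power set. For this example the reachable deterministic automaton has three states.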

Citations: 0

Unifying process networks for design of cyber physical systems

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952280

C. Grimm, Jiong Ou

Design of cyber-physical systems poses new challenges. Design at the level of a whole cyber-physical system includes design issues such as formal and abstract specification, design space exploration, optimization, and verification. A particular challenge is the formal and abstract representation of whole cyber-physical systems including both physical and cyber components. The objective of this paper is to show ways to unify process networks in order to enable representation of cyber-physical systems within the above-mentioned design issues.

Citations: 4

From design-time concurrency to effective implementation parallelism: The multi-clock reactive case

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952287

V. Papailiopoulou, D. Potop-Butucaru, Y. Sorel, R. De Simone, L. Besnard, J. Talpin

We have defined a full design flow starting from high-level domain-specific languages (Simulink, SCADE, AADL, SysML, MARTE, SystemC) and going all the way to the generation of deterministic concurrent (multi-threaded) executable code for (distributed) simulation or implementation. Based on the theory of weakly endochronous systems, our flow allows the automatic detection of potential parallelism in the functional specification, which is then used to allow the generation of concurrent (multi-threaded) code for parallel, possibly distributed implementations.

Citations: 16

A hardware/software codesign template library for design space exploration

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952279

P. Brunmayr, Jan Haase, C. Grimm

The ability to map a high level algorithm either to hardware or software simplifies design space exploration of cyber-physical systems. Thereby, low level tools can be utilized for accurate design parameter estimation, which helps to evaluate the effect of system level design decisions. Especially complex data structures pose a problem in this context. The different structure of memory in hardware and software requires different data structure implementations. With the presented data structure library a consistent design flow from a high level system model to either a hardware or software implementation is enabled. The concept extends the idea of abstract data types across the hardware/software boundary. Container adapters with appertaining implementations for system level simulation, hardware and software implementation support the designer throughout the whole design process. The benefit of the presented library is demonstrated and evaluated by a case study. With very little effort seven different hardware solutions were generated and compared concerning their power consumption and their resource usage.

Citations: 2

Kahn process networks applied to distributed heterogeneous HW/SW cosimulation

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952290

Dylan Pfeifer, J. Valvano

Heterogeneous, distributed hardware/software cosimulation techniques using the backplane method encounter complex interface protocols for simulator communication and synchronization, limiting their adoption or abstraction. We simplify the dynamics of backplane cosimulation to the properties of a Kahn Process Network (KPN), such that tokens of the KPN are interpolated events. This simplifies the backplane API and reduces the coordination problem to a parameterization of token update rates. The performance of this method is reported on a timed ISS model for Freescale HC12 microcontrollers (TExaS) coordinated with a Spice (Ngspice) circuit simulation.
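
As background for the Kahn Process Network model mentioned above, here is a minimal sketch using Python threads and unbounded FIFO queues. It is not the authors' backplane or its API: the process names, the `END` sentinel, and the use of plain integers as tokens (rather than the interpolated events described in the abstract) are assumptions for illustration. The defining property shown is that processes communicate only through FIFO channels with blocking reads, which makes the output independent of thread scheduling.

```python
import queue
import threading

END = object()  # hypothetical sentinel token marking the end of a stream


def source(out_ch, values):
    """Process that emits a fixed sequence of tokens."""
    for v in values:
        out_ch.put(v)
    out_ch.put(END)


def scale(in_ch, out_ch, factor):
    """Process that multiplies every incoming token by a constant."""
    while True:
        tok = in_ch.get()          # blocking read, as Kahn semantics require
        if tok is END:
            out_ch.put(END)
            return
        out_ch.put(tok * factor)


def sink(in_ch, results):
    """Process that collects tokens into a list."""
    while True:
        tok = in_ch.get()
        if tok is END:
            return
        results.append(tok)


def run_network(values, factor):
    """Wire source -> scale -> sink with unbounded FIFO channels and run it."""
    a, b = queue.Queue(), queue.Queue()
    results = []
    threads = [
        threading.Thread(target=source, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_network([1, 2, 3], 10)` yields the scaled sequence in order no matter how the operating system interleaves the three threads, because no process ever tests a channel for emptiness or reads from more than one channel at a time.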

Citations: 7

Just-in-time compilation for FPGA processor cores

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952282

Andrew Becker, Scott Sirowy, F. Vahid

Portability benefits have encouraged the trend of distributing applications using processor-independent instructions, a.k.a. bytecode, and executing that bytecode on an emulator running on a target processor. Transparent just-in-time (JIT) compilation of bytecode to native instructions is often used to increase application execution speed without sacrificing portability. Recent work has proposed distributing FPGA circuit applications in a SystemC bytecode to be emulated on a processor with portions possibly dynamically migrated to custom bytecode accelerator circuits or to native circuits on the FPGA. We introduce a novel JIT compiler for bytecode executing on a soft-core FPGA processor. During an iterative process of JIT compiler and emulator architecture codesign, we added JIT-aware resources on a soft-core processor's surrounding FPGA fabric, including a JIT memory, a signal queue, and an emulation memory controller — all unique to JIT compilation for FPGA processors versus traditional processors. Experiments show that regular JIT compilation achieved 3.0× average speedup over emulation, while our JIT-aware FPGA resources yielded an additional 5.2× average speedup, for a total of 15.7× average speedup, at a cost of 21% of a MicroBlaze processor core's slice usage.
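
The reported speedups compose multiplicatively, which a short calculation makes explicit. The three figures below are taken from the abstract; the variable names are illustrative. The product of the two rounded factors comes to about 15.6×, slightly below the reported 15.7×. The abstract does not explain the difference; rounding of the individual averages, or averaging per benchmark before combining, are plausible causes but are assumptions here.

```python
# Figures quoted in the abstract (average speedups).
jit_speedup = 3.0       # regular JIT compilation vs. emulation
fabric_speedup = 5.2    # additional gain from JIT-aware FPGA resources
reported_total = 15.7   # total average speedup as reported

# Speedups applied in sequence multiply.
combined = jit_speedup * fabric_speedup
gap = reported_total - combined

print(f"product of rounded factors: {combined:.1f}x")
print(f"gap to reported total: {gap:.1f}x")
```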

Citations: 4

A unifying interface abstraction for accelerated computing in sensor nodes

Pub Date : 2011-06-05 DOI: 10.1109/ESLSYN.2011.5952296

Srikrishna Iyer, Jingyao Zhang, Yaling Yang, P. Schaumont

Hardware/software codesign techniques are well suited to developing the next generation of sensornet applications, which have high computational demands. By making use of a low-power FPGA, the peak computational performance of a sensor node can be improved without significant degradation of the standby power dissipation. In this contribution, we present a methodology and tool to enable hardware/software codesign for sensor node application development. We present the integration of nesC, a sensornet programming language, with GEZEL, an easy-to-use hardware description language. We describe the hardware/software interface at different levels of abstraction: at the level of the design language, at the level of the co-simulator, and in the hardware implementation. We use a layered, uniform approach that is particularly suited to deal with the heterogeneous interfaces typically found on small embedded processors. We illustrate the strengths of our approach by means of a prototype application: the integration of a hardware-accelerated crypto-application in a nesC application.

Citations: 6
