M-SCOPES: latest articles


Generating hardware specific code at different abstraction levels using Averest

M-SCOPES

Pub Date : 2013-06-19 DOI: 10.1145/2463596.2486154

Omair Rafique, Manuel Gesell, K. Schneider

In general, embedded systems can be designed at different levels of abstraction, e.g., as pure hardware circuit designs, as bare-iron level programs (without an operating system), as programs based on a real-time operating system, and as models of a model-driven development. This paper focuses on a synchronous model-driven development tool called Averest. Using Averest, we describe how we consider and combine system descriptions at the mentioned four levels of abstraction. We discuss a case study targeting a distributed embedded system where these different levels have been used.


Citations: 7

GPU-CC: a reconfigurable GPU architecture with communicating cores

M-SCOPES

Pub Date : 2013-06-19 DOI: 10.1145/2463596.2486153

Gert-Jan van den Braak, H. Corporaal

GPUs have evolved into programmable, energy-efficient compute accelerators for massively parallel applications. Still, compute power is lost in many applications because cycles are spent on data movement and control instead of computations on actual data. Additional cycles can be lost to pipeline stalls caused by long-latency operations. To improve performance and energy efficiency, we introduce GPU-CC: a reconfigurable GPU architecture with communicating cores. It is based on a contemporary GPU, which can still be used as such, but it can also reorganize the cores of the GPU into a reconfigurable network. In GPU-CC, data movement and control are implicit in the configuration of the communication network. Additionally, each core executes a fixed instruction, reducing instruction decode count and increasing energy efficiency. We show a large performance potential for GPU-CC, e.g., 1.9x and 2.4x for a 3x3 and a 5x5 convolution application, respectively. The hardware cost of GPU-CC is mainly determined by the buffers in the added network, which amounts to 12.4% of extra memory space.


Citations: 6

Solving the simple offset assignment problem as a traveling salesman

M-SCOPES

Pub Date : 2013-06-19 DOI: 10.1145/2463596.2463601

M. Jünger, Sven Mallach

In this paper, we present an exact approach to the Simple Offset Assignment problem arising in the domain of address code generation for digital signal processors. It is based on transformations to weighted Hamiltonian cycle problems and integer linear programming. To the best of our knowledge, it is the first approach capable of solving all instances of the established OffsetStone benchmark set to optimality within reasonable time. It therefore enables the first evaluation of the quality of several heuristics relative to the optimum solutions. Further, using the same transformations, we present a novel improvement heuristic that provides a well-tunable trade-off between running time and solution quality.
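To make the objective of Simple Offset Assignment concrete, here is a minimal sketch under the standard textbook formulation: variables are laid out consecutively in memory, a move of the address register to a neighbouring address is free (auto-increment or auto-decrement), and any longer jump costs one extra instruction. The function names `soa_cost` and `brute_force_soa` are hypothetical, and the exhaustive search is only an illustration for tiny inputs; it is not the Hamiltonian-cycle or integer-programming method of the paper.

```python
from itertools import permutations


def soa_cost(access_sequence, layout):
    """Count transitions that need an explicit address-register update.

    `layout` lists the variables in memory order. A step between addresses
    that differ by at most 1 is assumed to be free; anything else costs 1.
    """
    address = {var: i for i, var in enumerate(layout)}
    return sum(
        1
        for prev, cur in zip(access_sequence, access_sequence[1:])
        if abs(address[prev] - address[cur]) > 1
    )


def brute_force_soa(access_sequence):
    """Try every layout and return (best_cost, best_layout).

    Exponential in the number of variables, so usable only on toy inputs.
    """
    variables = sorted(set(access_sequence))
    return min(
        (soa_cost(access_sequence, layout), layout)
        for layout in permutations(variables)
    )


if __name__ == "__main__":
    sequence = list("abca")
    print(brute_force_soa(sequence))
```

For the access sequence `a b c a`, every layout of three variables leaves two of them two addresses apart, so one transition always has to pay; an exact method is needed precisely because such conflicts multiply on realistic sequences.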


Citations: 5

OpenStream: a data-flow approach to solving the von Neumann bottlenecks

M-SCOPES

Pub Date : 2013-06-19 DOI: 10.1145/2463596.2486782

Antoniu Pop

As single-threaded performance is reaching its limits, the prevailing trend in multi-core and embedded MPSoC architectures is to provide an ever increasing number of processing units. This convergence leads to shared concerns, like scalability and programmability. Exploiting such architectures poses tremendous challenges to application programmers and to compiler/runtime developers alike. Uncovering raw parallelism is often insufficient in and of itself: improving performance requires changing the code structure to harness complex parallel hardware and memory hierarchies; translating more processing units into effective performance gains involves a combination of target-specific optimizations, subtle concurrency concepts and non-deterministic algorithms. In this presentation, we examine the limitations of current, von Neumann architectures and the impact on programmability of the drift from hardware-managed complexity to an increasing reliance on software solutions. We first propose OpenStream, a high-level data-flow programming model, as a pragmatic answer from the application programmer's perspective. Recognizing that the burden cannot be borne by either programmers or compilers alone, OpenStream is designed to strike a fair balance: programmers provide abstract information about their applications and leave the compiler and runtime system with the responsibility of lowering these abstractions to well-orchestrated threads and memory management. In the second part, we adopt the runtime developer's perspective and examine these impacts through the example of the implementation and proof of concurrent lock-free algorithms, a cornerstone of runtime system implementation, critically important in the context of relaxed memory consistency models.



Citations: 1

Design of safety-critical Java level 1 applications using affine abstract clocks

M-SCOPES

Pub Date : 2013-06-19 DOI: 10.1145/2463596.2463600

A. Bouakaz, J. Talpin

Safety-critical Java (SCJ) is designed to enable development of applications that are amenable to certification under safety-critical standards. However, its shared-memory concurrency model causes several problems such as data races, deadlocks, and priority inversion. We therefore propose a dataflow design model of SCJ applications in which periodic and aperiodic tasks communicate only through lock-free channels. We provide the necessary tools that compute scheduling parameters of tasks (e.g., periods, phases, and priorities) so that uniprocessor/multiprocessor preemptive fixed-priority schedulability is ensured and the throughput is maximized. Furthermore, the resulting schedule, together with the computed channel sizes, ensures underflow/overflow-free communications. The scheduling approach consists in constructing an abstract affine schedule of the dataflow graph and then concretizing it.
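The link between task periods, phases, and channel sizes can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: one producer and one consumer, integer time, zero execution time, and the producer's write landing before the consumer's read when both fire at the same instant. The function `simulate_channel` and all of its parameters are hypothetical names.

```python
def simulate_channel(prod_period, prod_phase, prod_tokens,
                     cons_period, cons_phase, cons_tokens, horizon):
    """Return (max_occupancy, underflow) for one channel over [0, horizon).

    max_occupancy is the buffer size that would avoid overflow in this run;
    underflow is True if the consumer ever fires without enough tokens.
    """
    occupancy = 0
    max_occupancy = 0
    underflow = False
    for t in range(horizon):
        # Assumption: at equal instants the producer's write lands first.
        if t >= prod_phase and (t - prod_phase) % prod_period == 0:
            occupancy += prod_tokens
            max_occupancy = max(max_occupancy, occupancy)
        if t >= cons_phase and (t - cons_phase) % cons_period == 0:
            if occupancy < cons_tokens:
                underflow = True  # the read is skipped in this sketch
            else:
                occupancy -= cons_tokens
    return max_occupancy, underflow


if __name__ == "__main__":
    # Producer: 1 token every 2 ticks. Consumer: 2 tokens every 4 ticks.
    print(simulate_channel(2, 0, 1, 4, 2, 2, horizon=40))  # consumer delayed
    print(simulate_channel(2, 0, 1, 4, 0, 2, horizon=40))  # consumer too early
```

With the consumer's phase set to 2, a channel of two slots suffices and no read underflows; starting the consumer at time 0 makes its first read underflow. Choosing phases and sizes so that neither failure can occur is the kind of guarantee the abstract describes, there obtained analytically rather than by simulation.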


Citations: 12
