2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia): Latest Articles

Bio-inspired distributed task remapping for multiple video stream decoding on homogeneous NoCs

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-12-17 DOI: 10.1109/ESTIMedia.2015.7351765

H. R. Mendis, L. Indrusiak, N. Audsley

Centralised management of distributed systems requires a significant amount of monitoring traffic to maintain an accurate view of the global system state. The communication overhead of these systems becomes a bottleneck as the number of processing elements in the network and the workload increase. State-of-the-art decentralised resource management techniques address this issue by allowing individual nodes or clusters of nodes to make decisions at runtime to manage the dynamic workload. The primary contribution of this paper is the use of a bio-inspired, distributed task remapping technique to manage dynamic multiple video stream decoding workloads. Our proposed technique has a low communication overhead and is used to reduce the cumulative job lateness of the video streams. Secondary contributions include several improvements to an existing cluster-based resource management approach to introduce awareness of task blocking and relocation distance. We evaluate these two remapping methods by comparing the improvement in job lateness, communication overhead and distribution of utilisation via simulation of several workload patterns.
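
To make the lateness metric concrete, here is a minimal Python sketch of one common way to compute cumulative job lateness. The abstract does not define the metric, so the definition used here (the sum over all jobs of the positive part of completion time minus deadline) and the names `Job` and `cumulative_lateness` are assumptions for illustration, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Job:
    """One video-stream job (hypothetical record; times in milliseconds)."""
    completion_time: float
    deadline: float


def cumulative_lateness(jobs: Iterable[Job]) -> float:
    """Sum of positive lateness over all jobs (assumed definition).

    A job that finishes at or before its deadline contributes zero.
    """
    return sum(max(0.0, j.completion_time - j.deadline) for j in jobs)


if __name__ == "__main__":
    jobs = [Job(12.0, 10.0), Job(8.0, 10.0), Job(25.0, 20.0)]
    # Lateness per job: 2.0, 0.0 (early), 5.0
    print(cumulative_lateness(jobs))  # prints 7.0
```

Under this definition, a remapping decision is beneficial when it lowers the total, even if some individual jobs finish later than before.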


Citations: 4

Quasi-static scheduling of data flow graphs in the presence of limited channel capacities

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-12-17 DOI: 10.1109/ESTIMedia.2015.7351766

J. Falk, T. Schwarzer, M. Glaß, J. Teich, C. Zebelein, C. Haubelt

Signal processing algorithms found in multimedia applications are often modeled by dynamic Data Flow Graphs (DFGs), especially when targeting heterogeneous multi-core platforms. However, there is often a mismatch between the fine granularity of the application and the coarse granularity of the platform. Tailoring the granularity of the DFG to a given platform by employing Quasi-Static Schedules (QSSs) promises performance gains by reducing dynamic scheduling overhead and enabling optimizations targeting groups of actors instead of individual actors in isolation. Unfortunately, all approaches known from the literature to compute QSSs implicitly assume DFGs with unbounded First In First Out (FIFO) channels. In contrast, mappings of DFGs to multi-core platforms must adhere to FIFO channels with limited capacities. In this paper, we present a novel FIFO channel capacity adjustment algorithm that enables QSSs for DFGs with limited channel capacities, thus extending the scope of QSS refinements to general multi-core targets.
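
The following Python sketch illustrates why bounded FIFO capacities matter for a fixed firing order. It is not the channel capacity adjustment algorithm from the paper; it only checks whether a given actor firing sequence stays within given channel capacities. The function `schedule_fits`, the `Channel` tuple layout and the example graph are hypothetical and assume fixed token rates per firing.

```python
from typing import Dict, List, Tuple

# Hypothetical channel description:
# (producer actor, tokens produced per firing,
#  consumer actor, tokens consumed per firing)
Channel = Tuple[str, int, str, int]


def schedule_fits(
    schedule: List[str],
    channels: Dict[str, Channel],
    capacity: Dict[str, int],
) -> bool:
    """Return True if firing the actors in `schedule` order never
    underflows or overflows any FIFO channel, starting from empty channels."""
    fill = {name: 0 for name in channels}
    for actor in schedule:
        # Consume first: the actor needs its input tokens to be present.
        for name, (_, _, consumer, cons) in channels.items():
            if consumer == actor:
                if fill[name] < cons:
                    return False  # not enough tokens: firing is not enabled
                fill[name] -= cons
        # Then produce: the output channel must have enough free space.
        for name, (producer, prod, _, _) in channels.items():
            if producer == actor:
                if fill[name] + prod > capacity[name]:
                    return False  # would exceed the channel capacity
                fill[name] += prod
    return True


if __name__ == "__main__":
    # Actor A produces 2 tokens per firing; actor B consumes 3 per firing.
    channels = {"ab": ("A", 2, "B", 3)}
    # Firing all A's first needs room for 6 tokens, so capacity 4 is too small.
    print(schedule_fits(["A", "A", "A", "B", "B"], channels, {"ab": 4}))  # False
    # Interleaving the firings keeps the fill level at or below 4.
    print(schedule_fits(["A", "A", "B", "A", "B"], channels, {"ab": 4}))  # True
```

Both sequences are valid with an unbounded channel, but only the interleaved one is valid with a capacity of 4, which is the kind of mismatch that arises when schedules are computed under an unbounded-FIFO assumption.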

Citations: 5

Floating point acceleration for stream processing applications in dynamically reconfigurable processors

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-12-17 DOI: 10.1109/ESTIMedia.2015.7351762

L. Bauer, Artjom Grudnitsky, Marvin Damschen, Srinivas Rao Kerekare, J. Henkel

Runtime reconfigurable processors provide a large degree of flexibility that allows them to dynamically adapt to different applications and requirements. They couple a standard processor with a runtime reconfigurable fabric (like an embedded FPGA) to offload computationally intensive kernels. In this paper we present the design and architecture of a flexible accelerator for floating point operations in stream processing applications. To integrate it in an existing reconfigurable processor, the different frequencies between the sequential processor (high frequency) and parallel accelerators (low frequencies) have to be managed. The results show 63.70× and 3.85× better performance-per-area efficiency when using our accelerator and the reconfigurable processor compared to the baseline processor with a soft-float implementation and a high-performance floating point unit, respectively.

Citations: 4

On-the-fly energy minimization for multi-mode real-time systems on heterogeneous platforms

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-12-17 DOI: 10.1109/ESTIMedia.2015.7351771

A. Lifa, P. Eles, Zebo Peng

The increasing computational demands of next generation multimedia systems require innovative optimization methods. Modern heterogeneous architectures bring together multiple general-purpose CPUs and multiple GPUs and FPGAs, in an attempt to answer the performance, energy-efficiency and flexibility requirements of today's complex multimedia applications. However, in order to leverage the advantages of such architectures, careful optimization is essential. In modern systems, more and more multimedia applications need real-time support (e.g. automotive systems that use image processing for active safety features). Real-time multi-mode systems are a good model for a wide range of applications that dynamically change their computational requirements over time. In this context, intelligent on-line resource management is needed, such that the heterogeneous resources are used in an energy-efficient manner, while meeting the real-time constraints. This paper proposes a resource manager that implements run-time policies to decide on-the-fly task admission and the mapping of active tasks to resources, such that the energy consumption of the system is minimized and all task deadlines are met.


Citations: 3

Memory-aware cooperative CPU-GPU DVFS governor for mobile games

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-12-17 DOI: 10.1109/ESTIMedia.2015.7351775

Chen-Ying Hsieh, Jurn-Gyu Park, N. Dutt, Sung-Soo Lim

Modern mobile heterogeneous platforms have GPUs integrated with multicore processors to enable execution of high-end graphics-intensive games. However, these gaming applications consume significant power due to heavy utilization of CPU-GPU resources, which drains battery resources that are critical for mobile devices. While Dynamic Voltage and Frequency Scaling (DVFS) techniques have been exploited previously for dynamic power management, contemporary techniques do not fully exploit the memory access footprint of graphics-intensive gaming applications, missing opportunities for energy efficiency. In this paper, we propose, for the first time, a memory-aware cooperative CPU-GPU DVFS governor that considers both the memory access footprint and the CPU/GPU frequency to improve the energy efficiency of high-end mobile game workloads. Our experimental results show that our proposed game governor achieves on average 13% and 5% improvement in energy efficiency, with minor degradation of performance, compared to default governors and state-of-the-art game governors, respectively.

Citations: 18

Mode-controlled data-flow modeling of real-time memory controllers

2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia)

Pub Date : 2015-10-08 DOI: 10.1109/ESTIMedia.2015.7351770

Yonghui Li, Hrishikesh Salunkhe, J. Bastos, Orlando Moreira, B. Akesson, K. Goossens

SDRAM is a shared resource in modern multi-core platforms executing multiple real-time (RT) streaming applications. It is crucial to analyze the minimum guaranteed SDRAM bandwidth to ensure that the requirements of the RT streaming applications are always satisfied. However, deriving the worst-case bandwidth (WCBW) is challenging because of the diverse memory traffic with variable transaction sizes. In fact, existing RT memory controllers either do not efficiently support variable transaction sizes or do not provide an analysis to tightly bound WCBW in their presence. We propose a new mode-controlled data-flow (MCDF) model to capture the command scheduling dependencies of memory transactions with variable sizes. The WCBW can be obtained by employing an existing tool to automatically analyze our MCDF model rather than using existing static analysis techniques, which, in contrast to our model, are hard to extend to cover different RT memory controllers. Moreover, the MCDF analysis can exploit static information about known transaction sequences provided by the applications or by the memory arbiter. Experimental results show that a 77% improvement in WCBW can be achieved compared to the case without known transaction sequences. In addition, the results demonstrate that the proposed MCDF model outperforms state-of-the-art analysis approaches and improves the WCBW by 22% without known transaction sequences.


Citations: 5
