Sequential specification of time-aware stream processing applications (Extended abstract)
Pub Date: 2012-10-11 | DOI: 10.1145/2435227.2435231
Stefan J. Geuns, J. Hausmans, M. Bekooij
Stream processing applications, and in particular Software Defined Radio applications, are typically executed on multi-core systems. Such applications often have real-time throughput constraints. Automatic parallelization of Nested Loop Programs (NLPs) is an attractive method to create embedded real-time stream processing applications for multi-core systems [1]. However, the description and parallelization of applications with time-dependent functional behavior has not been considered for NLPs. In such a description, semantic information about time-dependent behavior must be made available to the compiler, so that an optimized time-independent implementation can be generated automatically.
{"title":"Sequential specification of time-aware stream processing applications (Extended abstract)","authors":"Stefan J. Geuns, J. Hausmans, M. Bekooij","doi":"10.1145/2435227.2435231","DOIUrl":"https://doi.org/10.1145/2435227.2435231","url":null,"abstract":"Stream processing applications, and in particular Software Defined Radio applications, are typically executed on multi-core systems. Such applications often have real-time throughput constraints. Automatic parallelization of Nested Loop Programs (NLPs) is an attractive method to create embedded real-time stream processing applications for multi-core systems [1]. However, the description and parallelization of applications with a time dependent functional behavior has not been considered for NLPs. In such a description, semantic information about time dependent behavior must be made available for the compiler, such that an optimized time independent implementation can be generated automatically.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116682069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Support for power efficient mobile video playback on simultaneous hybrid display
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507035
Y. Wen, Ziyi Liu, W. Shi, Yifei Jiang, A. Cheng, Feng Yang, Abhinav Kohar
Mobile devices, such as smartphones, e-books, and tablets, have limited battery capacity because of constraints on battery size and mobility requirements. The large color displays on these devices aggravate this situation, as they consume a large portion of the total battery power. A TOLED-EPD hybrid display, which integrates a transparent OLED (TOLED) with an electrophoretic display (EPD), has emerged to reduce the energy usage of displays. The technology displays information selectively on one of the two displays based on the update rate of the content, thereby reducing energy usage. In this paper, we propose a design for mobile video playback, Decoder4Hybrid, for such hybrid displays. The proposed approach supports encoded video playback based on the update frequency of each block, which is exploited by the hybrid display controller to determine which display should be used to show an MPEG-encoded block. A fast DCT-based heuristic algorithm is proposed to detect changes between frames at the block level with minimal computation cost. Experimental results show that the proposed approach can save up to 40% power with acceptable video quality.
{"title":"Support for power efficient mobile video playback on simultaneous hybrid display","authors":"Y. Wen, Ziyi Liu, W. Shi, Yifei Jiang, A. Cheng, Feng Yang, Abhinav Kohar","doi":"10.1109/ESTIMedia.2012.6507035","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507035","url":null,"abstract":"Mobile devices, such as smartphones, e-books, and tablets, have limited battery capability because of the constraint of battery size and mobility requirement. However the large color displays on those devices put more tensions on this situation as the displays consume a large portion of the total battery power. A TOLED-EPD hybrid display that integrates a transparent OLED (TOLED) with an electrophoretic display (EPD) has been emerging to reduce the energy usage of displays. The technology displays information selectively on one of the displays based on the update rate of content, thus reduces the energy usage. In this paper, we propose a design of mobile video playback, Decoder4Hybrid, for the hybrid displays. The proposed approach supports encoded video playback based on the update frequency of each block, which is exploited by the hybrid display controller to determine which display should be used to show a MPEG encoded block. A fast DCT-based heuristic algorithm is proposed to detect the changes between frames at block level with minimal computation cost. Experimental results show that the proposed approach can save up to 40% power with acceptable video quality.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131498569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A lifetime aware buffer assignment method for streaming applications on DRAM/PRAM hybrid memory (Extended abstract)
Pub Date: 2012-10-01 | DOI: 10.1145/2435227.2435232
Daeyoung Lee, Hyunok Oh
This paper proposes a lifetime-aware buffer assignment method for streaming applications, such as multimedia, specified as a synchronous dataflow (SDF) graph on a DRAM/PRAM hybrid memory in which the endurance of the PRAM is limited. We determine whether buffers are assigned to DRAM or PRAM so as to minimize the write frequency of the PRAM. We formulate the resulting problems using Answer Set Programming (ASP). Experimental results show that the proposed approach increases the PRAM lifetime by 63% compared with no optimization, and they reveal the trade-off between PRAM and DRAM size required to guarantee a lifetime constraint.
{"title":"A lifetime aware buffer assignment method for streaming applications on DRAM/PRAM hybrid memory (Extended abstract)","authors":"Daeyoung Lee, Hyunok Oh","doi":"10.1145/2435227.2435232","DOIUrl":"https://doi.org/10.1145/2435227.2435232","url":null,"abstract":"This paper proposes a lifetime aware buffer assignment method for streaming applications like multimedia specified in a synchronous dataflow (SDF) graph on a DRAM/PRAM hybrid memory in which the endurance of PRAM is limited. We determine whether buffers are assigned to DRAM or PRAM to minimize the writing frequency of PRAM. To solve the problems, we formulate them using Answer Set Programming(ASP). Experimental results show that the proposed approach increases the PRAM lifetime by 63% compared with no optimization, and shows the tradeoff between PRAM and DRAM size to guarantee a lifetime constraint.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126253980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AVid: Annotation driven video decoding for hybrid memories
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507022
Liviu Codrut Stancu, L. A. Bathen, N. Dutt, A. Nicolau
Adopting emerging non-volatile memory (NVM) technologies is a viable way to minimize the increasing memory leakage power in today's embedded systems. However, to take advantage of the many benefits of NVMs, software must account for their high write overheads. This paper presents AVid, an annotation-driven video decoding technique for hybrid memory subsystems. AVid exploits the physical characteristics of NVMs by extracting video decoder access patterns and uses this meta-information to minimize write overheads, thereby improving energy savings and performance. Our experimental results on an annotation-aware H.264 codec show that our technique achieves reductions in execution time and energy of up to 40.8% and 39.7%, respectively, when applied to H.264 decoding.
{"title":"AVid: Annotation driven video decoding for hybrid memories","authors":"Liviu Codrut Stancu, L. A. Bathen, N. Dutt, A. Nicolau","doi":"10.1109/ESTIMedia.2012.6507022","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507022","url":null,"abstract":"Adopting emerging non-volatile memory (NVM) technologies is a viable solution to minimize the increasing memory leakage power in today's embedded systems. However, in order to take advantage of the many benefits in NVMs, software must account for their high write overheads. This paper presents AVid, an annotation driven video decoding technique for hybrid memory subsystems. AVid exploits the physical characteristics of NVMs by extracting video decoder access patterns and uses this meta-information to minimize write overheads, thereby improving energy savings and performance. Our experimental results on an annotation-aware H.264 codec show that our technique is able to achieve execution time and energy reduction by up to 40.8% and 39.7% respectively when applied to H.264 decoding.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"668 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116101704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-objective mapping optimization via problem decomposition for many-core systems
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507026
Shin-Haeng Kang, Hoeseok Yang, Lars Schor, Iuliana Bacivarov, S. Ha, L. Thiele
With the trend towards many-core systems for dynamic multimedia applications, the size of the mapping optimization problem has grown to the point where conventional meta-heuristics are no longer effective. In this paper, we therefore propose a problem decomposition approach for large-scale optimization problems. We follow the divide-and-conquer principle, in which a large-scale problem is divided into several sub-problems. To remove the inter-dependencies between sub-problems, proper abstraction is applied. The resulting sub-problems can be solved either in parallel or in sequence. The mapping optimization problem on dynamic many-core systems is decomposed and solved separately, considering the system state and the architectural hierarchy. Experimental evaluations with several examples show that the proposed technique outperforms conventional meta-heuristics in both the optimality and the diversity of the resulting Pareto curve.
{"title":"Multi-objective mapping optimization via problem decomposition for many-core systems","authors":"Shin-Haeng Kang, Hoeseok Yang, Lars Schor, Iuliana Bacivarov, S. Ha, L. Thiele","doi":"10.1109/ESTIMedia.2012.6507026","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507026","url":null,"abstract":"Due to the trend of many-core systems for dynamic multimedia applications, the problem size of mapping optimization gets bigger than ever making conventional meta-heuristics no longer effective. Thus, in this paper, we propose a problem decomposition approach for large scale optimization problems. We basically follow the divide-and-conquer concept, in which a large scale problem is divided into several sub-problems. To remove the inter-relationship between sub-problems, proper abstraction is applied. The divided sub-problems can be solved either in parallel or in a sequence. The mapping optimization problem on dynamic many-core systems is decomposed and solved separately considering the system state and architectural hierarchy. Experimental evaluations with several examples prove that the proposed technique outperforms the conventional meta-heuristics both in optimality and diversity of the optimized pareto curve.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123870840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keynote: “Design space exploration and run-time resource management in the embedded multi-core era”
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507017
V. Zaccaria
It is widely understood that the next revolution in virtual platform-based design is the holistic optimization of hardware parameters, task mapping and scheduling, and application tuning for many-cores. As a community, we have learned that finding the best trade-off in terms of selected figures of merit can be achieved only by considering the hardware and software dimensions together, which means evaluating an enormous number of configurations, each characterized by a long simulation time. The problem worsens when dealing with small ecosystems such as embedded systems-on-chip, where the environment is too constrained to assume that a sophisticated run-time algorithm can be implemented to schedule access to resources efficiently. In this keynote I will explain why some of the newest findings in global optimization can be used to address virtual-platform design effectively. I will then describe how sophisticated algorithms based on response-surface prediction models can efficiently identify optimal configurations in a significant number of platform optimization scenarios. Finally, I will outline some research directions stemming from the MULTICUBE and 2PARMA EU projects that I believe will unlock the full potential of platform optimization.
{"title":"Keynote: “Design space exploration and run-time resource management in the embedded multi-core era”","authors":"V. Zaccaria","doi":"10.1109/ESTIMedia.2012.6507017","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507017","url":null,"abstract":"It is widely understood that the next revolution of virtual platform-based design is the holistic optimization of hardware parameters, task mapping and scheduling, and application tuning for many-cores. As a community, we have learned that finding the best trade-off in terms of selected figures of merit can be achieved only by considering the integrated hardware and software dimensions, by evaluating an enormous number of configurations, each characterized by a long simulation time. The problem worsens when dealing with small ecosystems such as embedded systems-on-chip where the environment is too constrained to assume that a sophisticated run-time algorithm can be implemented to schedule efficiently the access to resources. In this keynote I will explain why some newest findings in global optimization can be used to address effectively virtual-platform design. I am going then to describe how sophisticated algorithms based on response-surface prediction models can efficiently identify optimal configurations in a significant number of platform optimization scenarios. Finally, I am going to outline some research directions stemming from the MULTICUBE and 2PARMA EU projects that I think will untap the full potential of platform optimization.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"149 15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129942174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards real-time applications in mobile web browsers
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507030
E. Aho, Kimmo Kuusilinna, T. Aarnio, Janne Pietiainen, Jari Nikara
WebGL and WebCL are web-targeted versions of the OpenGL ES and OpenCL standards. Using these standards, it is possible to better exploit the hardware resources of embedded systems from web browsers, allowing timely processing of audio, video, and graphics. WebGL excels in graphics applications, while WebCL fares better when more flexibility is required in execution platform selection, load balancing, data formats, control flow, or memory access patterns. This paper explores the potential for mobile web application acceleration using WebGL and particularly WebCL, which is currently under intense development. Where driver support is lacking, WebGL is used as a proxy to provide an estimate of the WebCL opportunity. Speedups on the order of 200x over JavaScript are demonstrated in best-case situations for a GPU target. In similar situations, CPU acceleration can reach 10x while running in a laptop browser. In addition, as building and optimizing a WebCL implementation is part of the reported work, an overview of the important development issues is given.
{"title":"Towards real-time applications in mobile web browsers","authors":"E. Aho, Kimmo Kuusilinna, T. Aarnio, Janne Pietiainen, Jari Nikara","doi":"10.1109/ESTIMedia.2012.6507030","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507030","url":null,"abstract":"WebGL and WebCL are web targeted versions of OpenGL ES and OpenCL standards. Using these standards, it is possible to better exploit the hardware resources in embedded systems from web browsers allowing timely processing of audio, video, and graphics. WebGL excels in graphics applications while WebCL fares better when more flexibility is required in execution platform selection, load balancing, data formats, control flow, or memory access patterns. This paper explores the potential for mobile web application acceleration utilizing WebGL and particularly WebCL which is currently under intense development. Where driver support is lacking, WebGL is used as a proxy to provide an estimate of WebCL opportunity. Speedups in the order of 200x over JavaScript are demonstrated in best case situations for a GPU target. In similar situations, CPU acceleration can be 10x while running in a laptop browser. In addition, as building and optimizing a WebCL implementation is part of the reported work, an overview of the important development issues is given.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116575818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Power versus quality trade-offs for adaptive real-time applications
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507032
Andrew Nelson, B. Akesson, A. Molnos, Sj Pas, K. Goossens
Electronic devices are expected to accommodate ever more complex functionality. Portable devices, such as mobile phones, have experienced a rapid increase in functionality while being constrained by the amount of energy that can be stored in their batteries. Dynamic Voltage and Frequency Scaling (DVFS) is a common technique used to trade processor speed for a reduction in power consumption. Adaptive applications can reduce their output quality in exchange for a reduction in their execution time. This exchange has been shown to be useful for meeting temporal constraints, but its usefulness for reducing energy and power consumption has not been investigated. In this paper, we present a technique that uses existing DVFS methods to trade a quality decrease for lower power/energy consumption through an intermediate reduction in execution time. Our technique achieves this while meeting soft and/or hard time, energy, and power constraints. We demonstrate the applicability of our technique on an adaptive H.263 decoder application running on a predictable hardware platform prototyped on an FPGA. We further contribute an experimental evaluation of the H.263 decoder's scalable mechanisms and their ability to trade quality for time, energy, and power. Our experiments show that the quality trading technique achieves up to a 45% increase in the number of frames decoded for the same amount of energy, compared with frequency scaling alone, at a quality reduction of up to 22 dB in Peak Signal-to-Noise Ratio (PSNR).
{"title":"Power versus quality trade-offs for adaptive real-time applications","authors":"Andrew Nelson, B. Akesson, A. Molnos, Sj Pas, K. Goossens","doi":"10.1109/ESTIMedia.2012.6507032","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507032","url":null,"abstract":"Electronic devices are expected to accommodate evermore complex functionality. Portable devices, such as mobile phones, have experienced a rapid increase in functionality, while at the same time being constrained by the amount of energy that may be stored in their batteries. Dynamic Voltage and Frequency Scaling (DVFS) is a common technique that is used to trade processor speed for a reduction in power consumption. Adaptive applications can reduce their output quality in exchange for a reduction in their execution time. This exchange has been shown to be useful for meeting temporal constraints, but its usefulness for reducing energy/power consumption has not been investigated. In this paper, we present a technique that uses existing DVFS methods to trade a quality decrease for lower power/energy consumption through an intermediary reduction in execution time. Our technique achieves this while meeting soft and/or hard time/energy/power constraints. We demonstrate the applicability of our technique on an adaptive H.263 decoder application, running on a predictable hardware platform that is prototyped on an FPGA. We further contribute an experimental evaluation of the H.263 decoder's scalable mechanisms, in their ability to trade quality for temporal/energy/power. From experimentation, we show that our quality trading technique is able to achieve up to a 45% increase in the number of frames decoded for the same amount of energy, in comparison to frequency scaling alone, but with a quality reduction of up to 22dB Peak Signal-to-Noise Ratio (PSNR).","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116961484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I/O aware task scheduling for energy harvesting embedded systems with PV and capacitor arrays
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507028
Kyungsoo Lee, T. Ishihara
The efficiency of a system powered by an energy generation source is important: high efficiency can reduce the cost of the system or extend its operating lifetime. High overall efficiency can be achieved through high generation efficiency, high consumption efficiency, or high transfer efficiency. Conventional maximum power point tracking (MPPT) techniques and multi-core scheduling methods do not consider the transfer efficiency in systems with multiple loads. This paper presents a generalized technique for task scheduling on a multi-core processor that takes the transfer efficiency to multiple loads into account. The target system supports dynamic reconfiguration of a photovoltaic/supercapacitor array to change the input voltage of the DC-DC converters feeding the loads. The proposed technique minimizes the power loss in the system's DC-DC converters and charger. Experiments with an actual application demonstrate that our approach reduces energy consumption by 17.7% compared with the conventional approach, which employs a processor with dynamic voltage and frequency scaling.
{"title":"I/O aware task scheduling for energy harvesting embedded systems with PV and capacitor arrays","authors":"Kyungsoo Lee, T. Ishihara","doi":"10.1109/ESTIMedia.2012.6507028","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507028","url":null,"abstract":"The system efficiency using an energy generation source is important. The high efficiency can reduce the cost of the system or increase the lifetime of the system operation. The high efficiency can be achieved by a high generating efficiency, a high consumption efficiency or a high transferring efficiency. Conventional maximum power point tracking (MPPT) techniques and multi-core scheduling methods do not consider the transferring efficiency in the multiple load system. This paper presents a generalized technique for the task scheduling of a multi-core processor considering the transferring efficiency in multiple loads. The target system contains a functionality for dynamic reconfiguration of a photovoltaic/supercapacitor array to change the input voltage of DC-DC converters in multiple loads. The proposed technique minimizes the power loss in the DC-DC converters and charger of the system. Experiments with actual application demonstrate that our approach reduces the energy consumption by 17.7% over the conventional approach, which employs a dynamic voltage and frequency processor.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127706296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Memory-centric VDF graph transformations for practical FPGA implementation
Pub Date: 2012-10-01 | DOI: 10.1109/ESTIMedia.2012.6507023
Matthew Milford, J. McAllister
Realising memory-intensive applications such as image and video processing on FPGA requires the creation of complex, multi-level memory hierarchies to achieve real-time performance; however, commercial High Level Synthesis tools are unable to derive such structures automatically and hence cannot meet the demanding bandwidth and capacity constraints of these applications. Current approaches to this problem derive either single-level memory structures or very deep, highly inefficient hierarchies, leading in either case to high implementation cost, low performance, or both. This paper presents an enhancement to an existing MC-HLS synthesis approach which solves this problem; it exploits and eliminates data duplication at multiple levels of the generated hierarchy, leading to a reduction in the number of levels and ultimately to higher-performance, lower-cost implementations. When applied to the synthesis of C-based Motion Estimation, Matrix Multiplication and Sobel Edge Detection applications, this enables reductions in Block RAM and Look Up Table (LUT) cost of up to 25%, whilst simultaneously increasing throughput.
{"title":"Memory-centric VDF graph transformations for practical FPGA implementation","authors":"Matthew Milford, J. McAllister","doi":"10.1109/ESTIMedia.2012.6507023","DOIUrl":"https://doi.org/10.1109/ESTIMedia.2012.6507023","url":null,"abstract":"Realising memory intensive applications such as image and video processing on FPGA requires creation of complex, multi-level memory hierarchies to achieve real-time performance; however commerical High Level Synthesis tools are unable to automatically derive such structures and hence are unable to meet the demanding bandwidth and capacity constraints of these applications. Current approaches to solving this problem can only derive either single-level memory structures or very deep, highly inefficient hierarchies, leading in either case to one or more of high implementation cost and low performance. This paper presents an enhancement to an existing MC-HLS synthesis approach which solves this problem; it exploits and eliminates data duplication at multiple levels levels of the generated hierarchy, leading to a reduction in the number of levels and ultimately higher performance, lower cost implementations. When applied to synthesis of C-based Motion Estimation, Matrix Multiplication and Sobel Edge Detection applications, this enables reductions in Block RAM and Look Up Table (LUT) cost of up to 25%, whilst simultaneously increasing throughput.","PeriodicalId":431615,"journal":{"name":"2012 IEEE 10th Symposium on Embedded Systems for Real-time Multimedia","volume":"584 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132509814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}