15th International Symposium on System Synthesis, 2002: Latest Papers

An object-oriented design process for system-on-chip using UML
Pub Date : 2002-10-02 DOI: 10.1145/581199.581254
T. Nakata, Akio Matsuda, M. Shoji, S. Kuwamura, Qiang Zhu
Object-oriented design has been a hot topic in software development because it can significantly improve product quality and productivity, which are also major concerns in system-on-chip design. In this paper, a design process is proposed for heterogeneous hardware-software systems by reinforcing parallelism, structure, and timing. The management of design abstraction is also introduced for refinement of hardware. UML is used as the modeling language, and the reinforcement above is integrated into UML through its extensibility mechanism. An example of architecture exploration and performance analysis is illustrated by applying the process to an image decoding design.
Citations: 46
A symbolic approach for the combined solution of scheduling and allocation
Pub Date : 2002-10-02 DOI: 10.1145/581199.581252
L. Lavagno, M. Lazarescu, S. Quer, Sergio Nocco, C. Passerone, G. Cabodi
Scheduling is widely recognized as a very important step in high-level synthesis. Nevertheless, it is usually done without taking into account the effects on the actual hardware implementation. This paper presents an efficient symbolic technique that concurrently integrates operation scheduling and resource allocation. The technique inherits all the features of "standard" BDD-based control-dominated scheduling, including resource-constraining, speculation, and pruning. In addition, it introduces an efficient way of encoding allocation information within a symbolic scheduling automaton, with a twofold target. First, it finds a minimum-cost allocation of operation resources satisfying a given schedule. Second, it optimizes the number of registers required to store intermediate results of operations. Theory and algorithms are developed and presented. Experimental results on a well-known set of benchmarks show the potential of the approach.
Citations: 4
A run-time word-level reconfigurable coarse-grain functional unit for a VLIW processor
Pub Date : 2002-10-02 DOI: 10.1145/581199.581211
Carles Rodoreda Sala, N. Busá
Nowadays, new DSP applications are offering combined and flexible multimedia and telecom services. VLIW processor architectures, which include dedicated but inflexible functional units, are usually tuned to a single specific application. In order to accelerate a wide range of applications, we propose a VLIW processor containing a novel run-time reconfigurable functional unit (RC-FU). Only a few hundred bits and a few cycles are needed to configure a new coarse-grain operation on the RC-FU. After reconfiguring its internal datapath and microprogram, the RC-FU can execute a number of look-alike DSP functions, such as 8-point DCT or 4-point FFT. The RC-FU itself is a VLIW processor, and the configuration contexts are generated using a high-level synthesis tool. The proposed RC-FU provides high processing power and can be efficiently tuned to the requirements of a variety of DSP applications.
Citations: 2
A Trimaran based framework for exploring the design space of VLIW ASIPs with coarse grain functional units
Pub Date : 2002-10-02 DOI: 10.1145/581199.581203
M. Balakrishnan, Anshul Kumar, P. Ienne, Anup Gangwar, Bhuvan Middha
It is widely accepted that the use of an Application Specific Instruction Set Processor (ASIP) in an embedded system can provide a solution that is much more flexible than ASICs and much more efficient than standard processors in terms of performance and power consumption. However, the lack of an acceptable design methodology and supporting tools for ASIPs limits their use even today. In this paper, we present a methodology for design space exploration of high-performance VLIW ASIPs by modeling Application Specific Functional Units in the Trimaran Compiler Infrastructure. To demonstrate the effectiveness of our strategy, we consider two important applications, FFT and Kalman Filter, and perform the compute-intensive operations in these applications via special Functional Units. The results we obtain are very promising, with up to 2× speed improvement.
Citations: 30
A new performance evaluation approach for system level design space exploration
Pub Date : 2002-10-02 DOI: 10.1145/581199.581239
C. P. Joshi, Anshul Kumar, M. Balakrishnan
Application-specific systems have the potential for customization of design with a view to achieving a better cost-performance-power trade-off. Such customization requires extensive design space exploration. In this paper, we introduce a performance evaluation methodology for system-level design exploration that is much faster than traditional cycle-accurate simulation. The trade-off is between accuracy and simulation speed. The methodology is based on probabilistic modeling of system components customized with application behavior. Performance numbers are generated by simulating these models. We have implemented our models using SystemC and validated them for uni-processor as well as multiprocessor systems against various benchmarks.
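The idea of generating performance numbers from probabilistic component models can be illustrated with a minimal sketch. This is not the authors' SystemC implementation: the function names `estimate_cycles` and `expected_cycles`, the single-cache model, and the latency values are all hypothetical assumptions chosen for illustration. Each instruction costs one cycle, accesses memory with a fixed probability, and that access hits a cache with a fixed probability.

```python
import random


def estimate_cycles(num_instructions, mem_fraction, hit_prob,
                    hit_cycles=1, miss_cycles=20, seed=0):
    """Simulate a probabilistic model of an instruction stream (hypothetical).

    mem_fraction: probability that an instruction accesses memory.
    hit_prob:     probability that a memory access hits the cache.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    cycles = 0
    for _ in range(num_instructions):
        cycles += 1  # base cost of issuing one instruction
        if rng.random() < mem_fraction:
            # Memory access: pay the hit or the miss latency.
            cycles += hit_cycles if rng.random() < hit_prob else miss_cycles
    return cycles


def expected_cycles(num_instructions, mem_fraction, hit_prob,
                    hit_cycles=1, miss_cycles=20):
    """Closed-form mean of the same model, useful as a sanity check."""
    per_access = hit_prob * hit_cycles + (1.0 - hit_prob) * miss_cycles
    return num_instructions * (1.0 + mem_fraction * per_access)
```

Because no real instructions are executed, such a model runs far faster than a cycle-accurate simulator, at the cost of accuracy, which is the trade-off the abstract describes.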
Citations: 9
A case study of hardware and software synthesis in ForSyDe
Pub Date : 2002-10-02 DOI: 10.1145/581199.581219
I. Sander, A. Jantsch, Zhonghai Lu
ForSyDe (FORmal SYstem DEsign) is a methodology that addresses the design of SoC applications, which may contain control-dominated as well as data-flow-dominated parts. Starting with a formal system specification that captures the functionality of the system, it provides refinement methods inside the functional domain to transform the abstract specification into an efficient implementation model, which serves as a starting point for synthesis into hardware and software. In this paper, we illustrate with a case study of a digital equalizer how a ForSyDe model can be synthesized into a hardware, a software, or a combined hardware/software implementation.
Citations: 20
System level power-performance trade-offs in embedded systems using voltage and frequency scaling of off-chip buses and memory
Pub Date : 2002-10-02 DOI: 10.1145/581199.581249
A. Chatterjee, P. Ellervee, V. Mooney, Jun-Cheol Park, Kyu-won Choi, Kiran Puttaswamy
In embedded systems, off-chip buses and memory (i.e., L2 memory, as opposed to the L1 memory, which is usually on-chip cache) consume significant power, often more than the processor itself. In this paper, for the case of an embedded system with one processor chip and one memory chip, we propose frequency and voltage scaling of the off-chip buses and the memory chip, and use a known micro-architectural enhancement called a store buffer to reduce the resulting impact on execution time. Our benchmarks show a system (processor + off-chip bus + off-chip memory) power savings of 28% to 36% and an energy savings of 13% to 35%, while increasing the execution time by 1% to 29%. Previous work in power-aware computing has focused on frequency and voltage scaling of the processors or selective power-down of subsets of off-chip memory chips. This paper quantitatively explores voltage/frequency scaling of off-chip buses and memory as a means of trading off performance for power/energy at the system level in embedded systems.
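The energy savings are smaller than the power savings because energy is average power multiplied by execution time, and scaling lengthens execution. The following is a minimal sketch of that arithmetic. The function name `energy_saving` and the input values are illustrative assumptions, not figures from the paper, and the abstract does not say which power saving pairs with which execution-time increase.

```python
def energy_saving(power_saving, time_increase):
    """Fractional energy saving implied by E = P * t (hypothetical helper).

    power_saving:  fractional reduction in average system power, e.g. 0.30.
    time_increase: fractional increase in execution time, e.g. 0.10.
    """
    scaled_power = 1.0 - power_saving   # new power relative to the baseline
    scaled_time = 1.0 + time_increase   # new run time relative to the baseline
    return 1.0 - scaled_power * scaled_time
```

For example, `energy_saving(0.30, 0.10)` gives about 0.23: a 30% power reduction combined with a 10% longer run yields roughly a 23% energy reduction.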
Citations: 27
OpenMP: parallel programming API for shared memory multiprocessors and on-chip multiprocessors
Pub Date : 2002-10-02 DOI: 10.1145/581199.581224
M. Sato
The OpenMP application programming interface is an emerging standard for parallel programming on shared-memory multiprocessors. Recently, OpenMP has been attracting widespread interest because of its easy-to-use, portable parallel programming model. In this paper, we give a brief introduction to the OpenMP API and its parallel programming model. We present our Omni OpenMP compiler and the performance of some applications on a shared-memory multiprocessor. Finally, the role of OpenMP for modern on-chip multiprocessors is discussed.
Citations: 82
Reducing energy consumption by dynamic copying of instructions onto onchip memory
Pub Date : 2002-10-02 DOI: 10.1145/581199.581247
M. Balakrishnan, P. Marwedel, L. Wehmeyer, Nils Grunwald, R. Banakar, S. Steinke
The number of mobile embedded systems is increasing, and all of them are limited in their uptime by their battery capacity. Several hardware changes have been introduced in recent years, but the steadily growing functionality still requires further energy reductions, e.g. through software optimizations. A significant amount of energy can be saved in the memory hierarchy, where most of the energy is consumed. In this paper, a new software technique is presented which supports the use of an onchip scratchpad memory by dynamically copying program parts into it. The set of selected program parts is determined with an optimal algorithm using integer linear programming. Experimental results show a reduction of the energy consumption by nearly 30%, a performance increase of 25% against a common cache system, and energy improvements of up to 38% against a static approach.
Citations: 123
Virtual synchronization for fast distributed cosimulation of dataflow task graphs
Pub Date : 2002-10-02 DOI: 10.1145/581199.581238
S. Ha, Sungchan Kim, Chan-Eun Rhee, Hyunguk Jung, Youngmin Yi, Dohyung Kim
Fast distributed cosimulation is a challenging problem in embedded system design. The main theme of this paper is to increase simulation speed by reducing the frequency of inter-simulator communications, reducing the active duration of simulators, and utilizing the parallelism of component simulators, which is accomplished by combining event-driven and data-driven simulation methods. The proposed technique is applicable when the simulated tasks follow dataflow execution semantics. Experimental results show that the proposed technique can boost the cosimulation speed significantly compared with previous conservative approaches.
Citations: 12