International Conference on Hardware/Software Codesign and System Synthesis: Latest Articles

A recursive approach to end-to-end path latency computation in heterogeneous multiprocessor systems
Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629494
S. Schliecker, R. Ernst
This paper proposes a method for the derivation of end-to-end delays of applications that involve processing on multiple components in a heterogeneous multiprocessor system. The procedure precisely captures the pipelined and parallel processing of multiple events along an application path by accurately capturing the resource timing and avoiding the pay-bursts-only-once problem. Both time-triggered and event-triggered task activation schemes with arbitrary event patterns are supported. In contrast to previous work, complex application topologies are allowed: the approach considers path forking and merging, as well as functional cycles and non-functional cyclic dependencies. The basis for the proposed method is an iterative compositional performance analysis that allows computing event models in such systems. Based on the event models and local performance abstractions, we propose a recursive approach to the derivation of the worst-case latency.
Citations: 45
Energy-efficiency for multiframe real-time tasks on a dynamic voltage scaling processor
Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629465
Chuan-Yue Yang, Jian-Jia Chen, Tei-Wei Kuo
Energy-aware design for electronic systems has been an important issue in hardware and software implementations. Dynamic voltage scaling (DVS) techniques have been adopted to effectively trade performance for energy consumption. However, most existing research on energy-efficient design in DVS systems with real-time constraints focuses on tasks with worst-case execution times. Once a task instance completes earlier than its worst-case estimation, the unused slack can be used for slowing down to reduce the energy consumption. This paper explores how to efficiently and effectively minimize the energy consumption to schedule a set of periodic real-time tasks with the multiframe property, in which the execution times of task instances are characterized by a vector of elements that are repeated. This paper proposes two types of approaches: (1) the task-based approach and (2) the frame-based approach. The task-based approach allocates the same time length for the executions of task instances belonging to the same task. The frame-based approach can reduce the energy consumption further by assigning an execution speed to each task frame. For on-line use, the scheduling overhead for speed determination is constant in both types of the proposed approaches. Simulations show that our proposed approaches sacrifice some optimality in terms of energy savings, compared to the optimal solutions, but require less space and less overhead for scheduling in the on-line (run-time) fashion.
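The energy benefit of slowing down shorter frames can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simplified power model in which power grows with the cube of the speed, and the function names and the frame vector are hypothetical. It compares running every frame of a multiframe task at the speed required by the largest frame against stretching each frame to fill the same time length, which loosely mirrors the equal time allocation described in the abstract.

```python
def energy(cycles, speed):
    """Energy for `cycles` of work at `speed`, assuming power = speed**3."""
    # Execution time is cycles / speed, so energy = power * time = cycles * speed**2.
    return cycles * speed ** 2


def worst_case_speed_energy(frames, slot):
    """Run every frame at the speed needed by the largest frame (assumed baseline)."""
    speed = max(frames) / slot
    return sum(energy(c, speed) for c in frames)


def equal_slot_energy(frames, slot):
    """Stretch each frame so that it exactly fills the same time length `slot`."""
    return sum(energy(c, c / slot) for c in frames)


# Hypothetical multiframe task: execution cycles repeat as (4, 1, 1).
frames = [4.0, 1.0, 1.0]
slot = 4.0
print(worst_case_speed_energy(frames, slot))  # 6.0
print(equal_slot_energy(frames, slot))        # 4.125
```

Under these assumptions the short frames run at a quarter of the worst-case speed, which is where the saving comes from.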
Citations: 9
Synthesis of heterogeneous pipelined multiprocessor systems using ILP: jpeg case study
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450137
Haris Javaid, S. Parameswaran
Streaming applications can be implemented with a pipeline of processors. Each processor in the pipeline can be an Application Specific Instruction Set Processor (ASIP) with the result being a heterogeneous pipelined MPSoC system. Since ASIPs can be of differing configurations, finding the optimal set of configurations for a multiprocessor architecture is a difficult problem. In this paper, we obtain an optimal system design for a set of processors which execute a multimedia application. The variables in the system are the presence or absence of different additional instructions and differing cache configurations for each of the processors. The problem is formulated as a 0-1 Integer Linear Programming (ILP) problem. To reduce the complexity of the ILP formulation, inferior ASIP configurations are efficiently pruned so that the solution can be reached quickly. Given a system runtime constraint, the proposed methodology finds a design with minimal area. We integrated this design methodology into a commercial design flow, and performed a case study upon the JPEG encoding application. We obtained 15 optimal designs subject to 15 different runtime constraints, each in less than 100 seconds from more than 4.2 x 10^13 design points.
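The pruning and selection steps can be sketched in a few lines. This is an illustration, not the paper's formulation: each configuration is a hypothetical (area, runtime) pair, the system runtime is assumed to be the sum of the per-processor runtimes, and an exhaustive search stands in for the 0-1 ILP solver.

```python
from itertools import product


def prune_dominated(configs):
    """Drop configurations that another one matches or beats on both area and
    runtime while being strictly better on at least one of them."""
    kept = []
    for i, (area, runtime) in enumerate(configs):
        dominated = any(
            a2 <= area and r2 <= runtime and (a2 < area or r2 < runtime)
            for j, (a2, r2) in enumerate(configs)
            if j != i
        )
        if not dominated:
            kept.append((area, runtime))
    return kept


def min_area_design(stages, runtime_limit):
    """Pick one configuration per processor so that total area is minimal and the
    (assumed additive) runtime stays within the limit. Returns None if infeasible."""
    best = None
    for choice in product(*(prune_dominated(s) for s in stages)):
        total_area = sum(a for a, _ in choice)
        total_runtime = sum(r for _, r in choice)
        if total_runtime <= runtime_limit and (best is None or total_area < best[0]):
            best = (total_area, choice)
    return best


# Hypothetical two-processor pipeline; (7, 9) is inferior to (6, 8) and is pruned.
stages = [[(10, 5), (6, 8), (7, 9)], [(4, 6), (9, 3)]]
print(min_area_design(stages, runtime_limit=12))  # (14, ((10, 5), (4, 6)))
```

Pruning first shrinks the search space without changing the optimum, because a dominated configuration can always be swapped for the one that dominates it.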
Citations: 19
Slack analysis in the system design loop
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450189
Girish Venkataramani, S. Goldstein
We present a system-level technique to analyze the impact of design optimizations on system-level timing dependencies. This technique enables us to speed up the design cycle by substituting, in the design loop, the time-consuming simulation step with a fast timing update routine. As a result, we can significantly reduce the design time from the order of hours/days to the order of seconds/minutes. The update algorithm is defined on the Transaction Level Model (TLM) and can be used by any design flow that invokes TLM-based optimizations. This algorithm has linear-time complexity in the program size and experimental results indicate that any loss of accuracy due to this technique is negligible (< ±1%); the benefit is a reduction in total design cycle time from several hours to a matter of seconds.
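For readers unfamiliar with slack, a generic textbook computation on a dependency graph is sketched below. It is not the update algorithm of the paper: the graph, the delays, and the function name `slacks` are hypothetical, and the sketch only shows why one forward and one backward pass over the graph are enough, which gives linear time in the graph size.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def slacks(delay, preds, deadline):
    """Return slack = required time - arrival time for every node of a DAG.

    delay:    node -> execution delay
    preds:    node -> list of predecessor nodes
    deadline: time by which every node must have finished
    """
    order = list(TopologicalSorter(preds).static_order())

    # Forward pass: earliest finish time of each node.
    arrival = {}
    for node in order:
        start = max((arrival[p] for p in preds.get(node, ())), default=0)
        arrival[node] = start + delay[node]

    # Backward pass: latest finish time that still meets the deadline.
    required = {node: deadline for node in order}
    for node in reversed(order):
        for p in preds.get(node, ()):
            required[p] = min(required[p], required[node] - delay[node])

    return {node: required[node] - arrival[node] for node in order}


delay = {"a": 2, "b": 3, "c": 1, "d": 2}
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(slacks(delay, preds, deadline=8))
```

In this example the path through `b` is critical, so `c` is the only node with extra slack.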
Citations: 0
ODOR: a microresonator-based high-performance low-cost router for optical networks-on-Chip
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450181
Huaxi Gu, Jiang Xu, Zheng Wang
The performance of system-on-chip is determined not only by the performance of its functional units, but also by how efficiently they cooperate with one another. It is the on-chip communication architecture which determines the cooperation efficiency. Network-on-Chip (NoC) is introduced to improve communication bandwidth and power efficiency. However, traditional metallic interconnects consume a significant amount of power to deliver large communication bandwidths. Optical NoCs are based on silicon optical interconnects with significant bandwidth and power advantages. Optical routers are the key enabling components of optical NoCs. This paper proposes a novel optical router architecture, ODOR, for optical NoCs based on the XY routing algorithm. We compared ODOR with four other router architectures, and analyzed three aspects in detail, including power consumption, optical power insertion loss, and the number of microresonators. The results show that ODOR has the lowest power consumption and losses and requires the fewest microresonators. ODOR has 40% less power consumption, 40% less loss, and 52% fewer microresonators than the fully connected crossbar. Furthermore, ODOR has a special feature which guarantees the maximum power to route a packet through a network to be a small constant number, regardless of the network size. The maximum power consumption is 0.96fJ/bit under current technology. We simulated a 6x6 2D mesh NoC based on ODOR, and showed the end-to-end delay and network throughput under different offered loads and packet sizes.
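XY routing, on which the router is based, is a dimension-ordered scheme for 2D meshes: a packet first travels along the X dimension until it reaches the destination column, then along Y. A minimal sketch follows; the function name and coordinate convention are assumptions for illustration and say nothing about the optical implementation in ODOR.

```python
def xy_route(src, dst):
    """Return the mesh coordinates visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    # Move along X until the destination column is reached.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Then move along Y until the destination row is reached.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Every route makes at most one turn, from X to Y, which keeps the per-router decision simple.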
Citations: 37
Model checking SystemC designs using timed automata
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450166
Paula Herber, Joachim Fellmuth, S. Glesner
SystemC is widely used for modeling and simulation in hardware/software co-design. Due to the lack of a complete formal semantics, it is not possible to verify SystemC designs. In this paper, we present an approach to overcome this problem by defining the semantics of SystemC by a mapping from SystemC designs into the well-defined semantics of Uppaal timed automata. The informally defined behavior and the structure of SystemC designs are completely preserved in the generated Uppaal models. The resulting Uppaal models allow us to use the Uppaal model checker and the Uppaal tool suite, including simulation and visualization tools. The model checker can be used to verify important properties such as liveness, deadlock freedom or compliance with timing constraints. We have implemented the presented transformation, applied it to two examples and verified liveness, safety and timing properties by model checking, thus showing the applicability of our approach in practice.
Citations: 99
Combination of instruction set simulation and abstract RTOS model execution for fast and accurate target software evaluation
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450168
Matthias Krause, Dominik Englert, O. Bringmann, W. Rosenstiel
Instruction set simulation and real time operating system modeling have become important issues for the design of distributed embedded systems. This paper presents a holistic approach to simulate a distributed, embedded system that includes target software, processing units, and abstract RTOS within a virtual prototype environment. The processing unit is modeled by an ISS, which is embedded in a SystemC environment to allow the integration into a platform model. In comparison to existing approaches, the RTOS is not directly running on the ISS but outsourced and replaced by an RTOS model. This step strongly reduces simulation time since the execution on the ISS is much more time consuming in contrast to the execution on the host processor. The results show the theoretical and measured performance gain depending on the RTOS scheduler and task switching.
Citations: 39
Static analysis of processor stall cycle aggregation
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450143
Jongeun Lee, Aviral Shrivastava
Processor Idle Cycle Aggregation (PICA) is a promising approach for low power execution of processors, in which small memory stalls are aggregated to create a large one, and the processor is switched to low-power mode in it. We extend the previously proposed approach in two dimensions. i) We develop static analysis for the PICA technique and present optimum parameters for five common types of loops based on steady-state analysis. ii) We show that software-only control is unable to guarantee its correctness in a varying runtime environment, potentially causing deadlocks. We enhance the robustness of PICA with minimal hardware extension, ensuring correct execution for any loops and parameters, which greatly facilitates exploration-based parameter optimization. The combined use of our static analysis and exploration-based fine-tuning makes the PICA technique applicable to any memory-bound loop, with energy reduction. We validate our analytical models against simulation-based optimization and also show through our experiments on embedded application benchmarks that our technique can be applied to a wide range of loops with an average energy reduction of 20% compared to executions without PICA.
Citations: 3
Providing accurate event models for the analysis of heterogeneous multiprocessor systems
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450177
S. Schliecker, J. Rox, M. Ivers, R. Ernst
This paper proposes a new method for deriving quantitative event information for compositional multiprocessor performance analysis. This procedure breaks down the complexity into the analysis of individual components (tasks mapped to resources) and the propagation of the timing information with the help of event models. This paper improves previous methods to derive event models in a multiprocessor system by providing tighter bounds and allowing arbitrarily shaped event models. The procedure is based on a simple yet expressive resource model called the multiple event busy time, which can be derived on the basis of classical scheduling theory; it can therefore be provided for a large domain of scheduling policies. Our experiments show that overestimation by previous methods can be reduced significantly.
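As background on what an event model expresses, the sketch below shows one widely used form: an upper arrival function for a periodic event stream with jitter, which bounds how many events can fall into any time window of a given length. The formula is a common textbook form and is an assumption here; it is not taken from the abstract, and the paper's arbitrarily shaped event models are more general.

```python
def eta_plus(dt, period, jitter=0):
    """Upper bound on the number of events in any window of length dt.

    Assumes a periodic stream with jitter and integer time units:
    eta_plus(dt) = ceil((dt + jitter) / period) for dt > 0, else 0.
    """
    if dt <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-(dt + jitter) // period)


print(eta_plus(10, period=10))            # 1
print(eta_plus(11, period=10))            # 2
print(eta_plus(6, period=10, jitter=5))   # 2
```

With jitter, two events can fall into a window shorter than the period, which is the kind of burst that downstream components have to absorb.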
Citations: 60
Co-design in the wilderness
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450185
D. Gale
Hardware technology platforms are blurring at the edges as the integration frontier, a co-design frontier, holds promise of new functionality being achieved beyond electronics and software, through photonic and fluidic technologies. The problems here are complex as one considers the multiple technology partitions and the possibilities for exploring trade-offs and so this co-design frontier is largely untamed. A co-design "box", somewhat empirical by nature, supports exploratory research interests. It is motivated by trying to merge two lines of activity: one involving electronic-software rapid-prototyping and the other involving the design and fabrication of novel non-electronic devices or structures. The expectation initially is to demonstrate from an embedded system perspective whether novel fluidic devices perform as intended. Other devices will be considered in the future. Individuals and groups working in these frontier areas have attempted to promote some degree of standardization which might help clear a path forward in support of less empirical co-design techniques. Experience with microelectronics is most often used as a model with reference to the hierarchy of leaf-cells, components, functionally-designated subsystems and defined physical and signal interfaces. Physical aspects of internet connectivity are an example of advances made at the photonics-electronics frontier using multiple signal wavelengths and command, control and communication involving software, microelectronics, photonics and signal conversion. Progress with fluidics is at an early stage and major outcomes, no less transformative than the internet in the last 20 years, will occur whether in health-care or the environment or in some other sector. Complex devices or micro-assemblies that carry electronic, photonic and fluidic signals are now made regularly. Co-design technology, while lagging seriously, has the potential to reduce exploration barriers at the integration frontier, multiplying the number of exploratory paths being pursued by an increasing number of practitioners and yielding beneficial outcomes sooner than might otherwise be expected.
Citations: 0