
International Conference on Hardware/Software Codesign and System Synthesis: Latest Papers

Yield maximization for system-level task assignment and configuration selection of configurable multiprocessors
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450192
L. Singhal, Sejong Oh, E. Bozorgzadeh
A configurable multiprocessor system is a promising design alternative because of its high degree of flexibility, short development time, and potentially high performance under application-driven constraints. An important design challenge for multi-core systems at 45nm is manufacturing process variation. With growing concern over within-die (WID) variation, designers will have to choose processing-core configurations that maximize system yield without violating performance and throughput constraints. Because processor configuration selection and task allocation are interdependent and jointly affect yield and latency constraints, we tackle both problems simultaneously. In this paper, we formulate the problem of task allocation and configuration selection for yield optimization. We prove that the problem is NP-hard and propose an optimal pseudo-polynomial algorithm on Serial-Parallel graphs. We target streaming applications in pipelined reconfigurable multiprocessor systems and provide a case study that uses configurable Leon processors implemented on an FPGA as the cores. Results show that solving the proposed problem can significantly improve the timing yield of the system by exploiting extra slack on tasks.
Citations: 12
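As an illustration of what a pseudo-polynomial formulation can look like for the yield problem in the abstract above, the sketch below handles only a simplified case: a chain of tasks, each with a few candidate configurations given as (integer latency, yield) pairs, a single latency budget, and independent per-task yields. These assumptions, the function name max_yield_chain, and any sample numbers are hypothetical; this is a multiple-choice knapsack recurrence, not the authors' algorithm.

```python
from typing import List, Tuple


def max_yield_chain(options: List[List[Tuple[int, float]]],
                    latency_budget: int) -> float:
    """Pick one (latency, yield) configuration per task on a chain.

    Returns the largest product of yields whose total latency stays within
    latency_budget, or 0.0 if no selection fits. Yields are assumed to be
    strictly positive and independent across tasks (an assumption of this
    sketch, not a claim from the paper).
    """
    # best[t] = best yield product reachable with total latency exactly t;
    # 0.0 marks "not reachable".
    best = [0.0] * (latency_budget + 1)
    best[0] = 1.0
    for choices in options:
        nxt = [0.0] * (latency_budget + 1)
        for t, yield_so_far in enumerate(best):
            if yield_so_far == 0.0:
                continue  # no partial selection ends at latency t
            for latency, y in choices:
                total = t + latency
                if total <= latency_budget:
                    nxt[total] = max(nxt[total], yield_so_far * y)
        best = nxt
    return max(best)
```

The running time is proportional to the number of tasks times the latency budget times the number of configurations per task, which is why the bound is pseudo-polynomial rather than polynomial.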
Static analysis for fast and accurate design space exploration of caches
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450159
Yun Liang, T. Mitra
Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip real estate. Simulation, in particular trace-driven simulation, is widely used to estimate cache hit rates. However, simulation is too slow for design space exploration, especially when it involves hundreds of design points and huge traces or long program executions. In this paper, we propose a novel static analysis technique for rapid and accurate design space exploration of instruction caches. Given the program control flow graph (CFG) annotated only with basic-block and control-flow-edge execution counts, our analysis estimates the hit rates for multiple cache configurations in one pass. We achieve this by modeling the cache states at each node of the CFG in a probabilistic manner and exploiting the structural similarities among related cache configurations. Experimental results indicate that our analysis is 24 to 3,855 times faster than the fastest known cache simulator while maintaining high accuracy (0.7% average error) in predicting hit rates for popular embedded benchmarks.
Citations: 18
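For context on the baseline that the cache abstract above compares against, here is a minimal sketch of trace-driven simulation for a direct-mapped cache. It is not the paper's static analysis. The function name direct_mapped_hit_rate and its parameters are hypothetical, and real simulators also model associativity and replacement policies.

```python
from typing import Iterable, List, Optional


def direct_mapped_hit_rate(trace: Iterable[int],
                           num_sets: int,
                           block_size: int) -> float:
    """Replay an address trace through a direct-mapped cache.

    trace      : byte addresses in execution order
    num_sets   : number of cache lines (one block per set)
    block_size : block size in bytes
    Returns hits / accesses, or 0.0 for an empty trace.
    """
    tags: List[Optional[int]] = [None] * num_sets  # tag stored in each line
    hits = 0
    accesses = 0
    for addr in trace:
        accesses += 1
        block = addr // block_size   # memory block holding this address
        index = block % num_sets     # cache line the block maps to
        tag = block // num_sets      # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag        # miss: fill the line
    return hits / accesses if accesses else 0.0
```

Each call replays the whole trace for one configuration, so exploring many (num_sets, block_size) points costs one full pass per point. That per-configuration cost is what the abstract's one-pass analysis aims to avoid.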
Specification and OS-based implementation of self-adaptive, hardware/software embedded systems
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450151
Yvan Eustache, J. Diguet
This paper presents our solution for specifying and implementing self-adaptiveness within an OS-based, reconfigurable embedded system according to objectives such as quality of service (QoS), performance, or power consumption. More precisely, we detail our approach to separating, at runtime, application-specific decisions from hardware/software implementation decisions at the system level. The former concern the control of application efficiency and are specified in Local Configuration Managers (LCMs) based on the knowledge of application engineers. The latter are generic and address the choice between various hardware and software implementations according to the observed gap between online measurements and the objectives set by the user; these decisions are implemented in the Global Configuration Manager (GCM) as an adaptive closed-loop model. We have designed a video tracking application on an FPGA to demonstrate the effectiveness of our solution. Results are given for a system built around a NIOS soft core with the µCOS II RTOS and new services for managing hardware and software tasks transparently.
Citations: 14
Simulation and embedded software development for Anton, a parallel machine with heterogeneous multicore ASICs
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450165
J. P. Grossman, C. Young, Joseph A. Bank, Kenneth M. Mackenzie, D. Ierardi, J. Salmon, R. Dror, D. Shaw
Anton, a special-purpose parallel machine currently under construction, is the result of a significant hardware-software codesign effort that relied heavily on an architectural simulator. One of this simulator's many important roles is to support the development of embedded software (software that runs on Anton's ASICs), which is challenging for several reasons. First, the Anton ASIC is a heterogeneous multicore system-on-a-chip, with three types of embedded cores tightly coupled to special-purpose hardware units. Second, a standard 512-ASIC configuration contains a total of 6,656 distinct embedded cores, all of which must be explicitly modeled within the simulator. Third, a portion of the embedded software is dynamically generated at simulation time. This paper discusses the various ways in which the Anton simulator addresses these challenges. We use a hardware abstraction layer that allows embedded software source code to be compiled without modification for either the simulation host or the hardware target. We report on the effectiveness of embedding golden-model testbenches within the simulator to verify embedded software as it runs. We also describe our hardware-software cosimulation strategy for dynamically generated embedded software. Finally, we use a methodology that we refer to as concurrent mixed-level simulation to model embedded cores within massively parallel systems. These techniques allow the Anton simulator to serve as an efficient platform for embedded software development.
Citations: 7
Methodology for multi-granularity embedded processor power model generation for an ESL design flow
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450194
Young-Hwan Park, S. Pasricha, F. Kurdahi, N. Dutt
With power becoming a major constraint for multi-processor embedded systems, it is becoming important for designers to characterize and model processor power dissipation. It is critical for these processor power models to be useable across various modeling abstractions in an electronic system level (ESL) design flow, to guide early design decisions. In this paper, we propose a unified processor power modeling methodology for the creation of power models at multiple granularity levels that can be quickly mapped to an ESL design flow. Our experimental results based on applying the proposed methodology on an OpenRISC processor demonstrate the usefulness of having multiple power models. The generated models range from very high-level two-state and architectural/ISS models that can be used in transaction level models (TLM), to extremely detailed cycle-accurate models that enable early exploration of power optimization techniques. These models offer a designer tremendous flexibility to trade off estimation accuracy with estimation/simulation effort.
Citations: 11
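To make the coarsest granularity in the power-model abstract above concrete, the sketch below shows one plausible reading of a two-state model: the processor is either active or idle, and energy is average power per state times time spent in that state. The class name TwoStatePowerModel, the units, and any power figures are hypothetical and are not taken from the paper's OpenRISC characterization.

```python
from dataclasses import dataclass


@dataclass
class TwoStatePowerModel:
    """Coarse processor power model with one average power per state."""
    active_mw: float  # average power while executing, in milliwatts
    idle_mw: float    # average power while idle, in milliwatts

    def energy_uj(self, active_us: float, idle_us: float) -> float:
        """Energy in microjoules for the given time in each state.

        mW * us gives nanojoules, so divide by 1000 to get microjoules.
        """
        nanojoules = self.active_mw * active_us + self.idle_mw * idle_us
        return nanojoules / 1000.0
```

A model like this needs only the active and idle durations of each transaction, which is the kind of information a transaction-level simulation can provide; finer-grained models trade more simulation effort for accuracy, as the abstract describes.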
Power reduction via macroblock prioritization for power aware H.264 video applications
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450195
Michael A. Baker, V. Parameswaran, Karam S. Chatha, Baoxin Li
As the importance of multimedia applications in hand-held devices increases, the computational strain and corresponding demand for energy in such devices continue to grow. Portable multimedia devices with inherently limited energy supplies face tight energy constraints and require optimization for energy conservation. Power-aware applications give their users the flexibility to trade off performance against battery life. This paper introduces a power-aware technique that offers user-selectable power reduction in exchange for controlled reductions in video quality for H.264 video streams. The technique uses an encoder-decoder pair. The encoder characterizes video streams and provides information to the decoder via Flexible Macroblock Ordering (FMO) by generating prioritized slice groups. The decoder selectively ignores low-priority slice groups based on the user-selected preference, effectively reducing the decoder workload. With the reduced computational requirement, processor dynamic voltage and frequency scaling (DVFS) significantly improves decoder power performance within timing constraints. Our PXA270 system implementation resulted in power savings of as much as 53% with an average PSNR per frame of 24dB compared to the unmodified video.
Citations: 13
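The H.264 abstract above reports quality as PSNR against the unmodified video. As a reminder of how that metric is computed, here is a minimal sketch using the standard definition, 10 * log10(MAX^2 / MSE), for 8-bit samples. The function name psnr_db and the flat-list frame representation are assumptions made for illustration; the abstract does not describe the authors' measurement setup.

```python
import math
from typing import Sequence


def psnr_db(reference: Sequence[int],
            decoded: Sequence[int],
            max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of sample values (for example, luma samples).
    Returns math.inf when the frames are identical.
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Averaging this value over all frames of a sequence gives a per-frame average PSNR of the kind quoted in the abstract.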
Profiling of lossless-compression algorithms for a novel biomedical-implant architecture
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450160
C. Strydis, G. Gaydadjiev
In view of a booming market for microelectronic implants, our ongoing research focuses on the specification and design of a novel biomedical microprocessor core targeting a large subset of existing and future biomedical applications. Towards this end, we have taken steps in identifying various tasks commonly required by such applications and profiling their behavior and requirements. A prominent family of such tasks is lossless data compression. In this work we profile a large collection of compression algorithms on suitably selected biomedical workloads. Compression ratio, average and peak power consumption, total energy budget, compression rate, and program-code size metrics have been evaluated. Findings indicate the best-performing algorithms across most metrics to be mlzo (scores high in 5 out of 6 imposed metrics) and fin (present in 4 out of 6 metrics). Further mlzo profiling reveals the dominance of i) address-generation, load, branch and compare instructions, and ii) interdependent logical-logical and logical-compare instruction combinations.
Citations: 10
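Two of the metrics named in the compression abstract above, compression ratio and compression rate, can be measured with a few lines of code. The sketch below uses the zlib, bz2, and lzma codecs from the Python standard library purely as stand-ins; they are not the algorithms profiled in the paper (mlzo and fin are not included), and the function name profile_compressors is hypothetical. Power, energy, and code-size metrics need hardware or simulator support and are out of scope here.

```python
import bz2
import lzma
import time
import zlib
from typing import Dict


def profile_compressors(data: bytes) -> Dict[str, Dict[str, float]]:
    """Measure compression ratio and rate for a few stdlib codecs.

    ratio            : original size / compressed size (higher is better)
    bytes_per_second : input bytes processed per second of wall-clock time
    """
    codecs = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    results: Dict[str, Dict[str, float]] = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {
            "ratio": len(data) / len(packed),
            "bytes_per_second": (len(data) / elapsed
                                 if elapsed > 0 else float("inf")),
        }
    return results
```

Timing on a desktop host says little about an implant-class core, so a result from this sketch is only meaningful for comparing codecs on the same machine and workload.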
Highly-cited ideas in system codesign and synthesis
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450178
F. Vahid, T. Givargis
We conducted a study of citations of papers published between 1996 and 2006 in the CODES and ISSS conferences, representing the hardware/software codesign and system synthesis community. Citations, meaning non-self-citations only, were considered from all papers known to Google Scholar, as well as only from subsequent CODES/ISSS papers. We list the most-cited CODES/ISSS papers of each year, summarizing their topics, and discussing common features of those papers. For comparison purposes, we also measured citations for the computer architecture community's ISCA and MICRO conferences, and for the field-programmable gate array community's FPGA and FCCM conferences. We point out several interesting differences among the citation patterns of the three communities.
Citations: 2
A performance-oriented hardware/software partitioning for datapath applications
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450149
L. Frigerio, F. Salice
This article proposes a hardware/software partitioning method targeted at performance-constrained systems for datapath applications. Exploiting a platform-based design, a Timed Petri Net formalism is proposed to represent the mapping of the application onto the platform, allowing performance estimates to be extracted statically in the early phases of the design process without the need for expensive simulations. The mapping process is generalized to allow an automatic exploration of the solution space that identifies the best performance/area configurations among several application-architecture combinations. The method is evaluated by implementing a typical performance-constrained datapath system, namely a packet processing application.
Citations: 0
A security monitoring service for NoCs
Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450180
Leandro Fiorin, G. Palermo, C. Silvano
As computing and communications increasingly pervade our lives, the security and protection of sensitive data and systems are emerging as extremely important issues. Networks-on-Chip (NoCs) have appeared as a design strategy to cope with the rapid increase in complexity of Multiprocessor Systems-on-Chip (MPSoCs), but only recently has the research community addressed security in NoC-based architectures. In this paper, we present a monitoring system for NoC-based architectures, whose goal is to help detect security violations carried out against the system. The information collected is sent to a central unit so that actions performed by attackers can be counteracted efficiently. We detail the design of the basic blocks and analyse the overhead associated with the ASIC implementation of the monitoring system, discussing the types of security threats that it can help detect and counteract.
Citations: 76