Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors: Latest Articles

SIMD extension to VLIW multicluster processors for embedded applications

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106823

D. Barretta, W. Fornaciari, M. Sami, D. Pau

We propose a retargetable architecture based on a multicluster VLIW processor that can exploit either instruction-level parallelism (ILP) alone or ILP and data-level parallelism (DLP) jointly in a SIMD fashion. Simulation results show that performance may increase significantly when the application is compiled for the proposed architecture.

Citations: 6

The Imagine Stream Processor

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106783

U. Kapasi, W. Dally, S. Rixner, John Douglas Owens, Brucek Khailany

The Imagine Stream Processor is a single-chip programmable media processor with 48 parallel ALUs. At 400 MHz, this translates to a peak arithmetic rate of 16 GFLOPS on single-precision data and 32 GOPS on 16-bit fixed-point data. The scalability of Imagine's programming model and architecture enables it to achieve such high arithmetic rates. Imagine executes applications that have been mapped to the stream programming model. The stream model decomposes applications into a set of computation kernels that operate on data streams. This mapping exposes the inherent locality and parallelism in the application, and Imagine exploits the locality and parallelism to provide a scalable architecture that supports 48 ALUs on a single chip. This paper presents the Imagine architecture and programming model in the first half and explores the scalability of the Imagine architecture in the second half.
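
As a rough illustration of the stream programming model described in this abstract, the sketch below chains two small kernels over a data stream. It is not Imagine's actual programming interface: the kernel names `scale` and `offset` and the helper `run_pipeline` are hypothetical, chosen only to show an application decomposed into kernels that consume and produce streams.

```python
from typing import Callable, Iterable, Iterator, List

# A kernel consumes one stream and produces another (hypothetical type alias).
Kernel = Callable[[Iterable[float]], Iterator[float]]


def scale(stream: Iterable[float]) -> Iterator[float]:
    """Hypothetical kernel: multiply every stream element by 2."""
    for x in stream:
        yield x * 2.0


def offset(stream: Iterable[float]) -> Iterator[float]:
    """Hypothetical kernel: add 1 to every stream element."""
    for x in stream:
        yield x + 1.0


def run_pipeline(source: Iterable[float], kernels: List[Kernel]) -> List[float]:
    """Feed the source stream through each kernel in order."""
    stream: Iterable[float] = iter(source)
    for kernel in kernels:
        stream = kernel(stream)  # output stream of one kernel feeds the next
    return list(stream)


result = run_pipeline([1.0, 2.0, 3.0], [scale, offset])
print(result)
```

Each element flows through the kernels independently of its neighbours, which is the kind of data parallelism the stream model is said to expose. The intermediate values pass directly from one kernel to the next, which loosely mirrors the locality the abstract mentions.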

Citations: 256

Dynamic loop caching meets preloaded loop caching: a hybrid approach

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106810

A. Gordon-Ross, F. Vahid

Dynamically-loaded tagless loop caching reduces instruction fetch power for embedded software with small loops, but only supports simple loops without taken branches. Preloaded tagless loop caching supports complex loops with branches and thus can reduce power further, but has a limit on the total number of instructions cached. We show that each does well on particular benchmarks, but neither is best across all of those benchmarks. We present a new hybrid loop cache that only preloads the complex loops, while dynamically loading other loops, thus achieving the strengths of each approach. We demonstrate better power savings than either previous approach alone.

Citations: 14

On the detectability of parametric faults in analog circuits

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106781

J. Savir, Zhen Guo

This paper investigates the detectability of parameter faults in linear, time-invariant, analog circuits. We show that there are inherent limitations with regard to analog fault detectability.

Citations: 22

Design methodology and system for a configurable media embedded processor extensible to VLIW architecture

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106738

Atsushi Mizuno, K. Kohno, Ryuichiro Ohyama, T. Tokuyoshi, H. Uetani, H. Eichel, T. Miyamori, Nobu Matsumoto, M. Matsui

A new integrated system to design and generate a configurable embedded processor for multimedia applications has been developed. The system, "Media embedded Processor Integrator", has the distinctive feature of generating development tools, such as compilers and simulators, not only for the configurable embedded processor but also for its template-based extensible VLIW co-processor. This paper describes the architecture and function of the "Media embedded Processor Integrator", focusing especially on how the system treats the VLIW co-processor extension. As an example, to determine an ISA for a 3-way VLIW co-processor for image recognition, several different ISAs were evaluated and compared for the best performance using the corresponding compilers and simulators generated by the system. The system contributed greatly to shortening this entire ISA definition process.

Citations: 35

Embedded protocol processor for fast and efficient packet reception

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106804

T. Henriksson, U. Nordqvist, Dake Liu

Computer network equipment presents a bottleneck for further increasing the capacity of networks. Terminals have problems keeping up with network speed when using general-purpose processors for protocol processing. We present a novel processor architecture that works in-line with the data flow and does not use a traditional von Neumann architecture. The program is contained in three lookup tables within the processor core, which allows for one-cycle if-then-else and switch-case-case... execution. The processor is estimated to be able to handle a 10 Gb/s Ethernet connection when implemented in 0.18 micron technology.

Citations: 21

System-architectures for sensor networks: issues, alternatives, and directions

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106775

Jessica Feng, F. Koushanfar, M. Potkonjak

Our goal is to identify the key architectural and design issues related to Sensor Networks (SNs), evaluate the proposed solutions, and outline the most challenging research directions. The evaluation has three scopes: individual components on SN nodes (processor, communication, storage, sensors, actuators, and power supply), node level, and networked system level. Special emphasis is placed on architecture and system software, and on new challenges related to the usage of new types of components in networked systems. The evaluation is guided by anticipated technology trends and both current and future applications. The main conclusion of the analysis is that the architectural and synthesis emphasis will be shifted from computation and, to some extent, communication components to sensors and actuators.

Citations: 83

Branch predictor prediction: a power-aware branch predictor for high-performance processors

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106813

A. Baniasadi, Andreas Moshovos

We introduce branch predictor prediction (BPP) as a power-aware branch prediction technique for high-performance processors. Our predictor reduces branch prediction power dissipation by selectively turning on and off two of the three tables used in the combined branch predictor. BPP relies on a small buffer that stores the addresses and the sub-predictors used by the most recent branches executed. Later we refer to this buffer to decide if any of the sub-predictors and the selector could be gated without harming performance. In this paper we study power and performance trade-offs for a subset of SPEC 2k benchmarks. We show that on average, for an 8-way processor, BPP can reduce branch prediction power dissipation by 28% and 14% compared to non-banked and banked 32k predictors respectively. This comes with a negligible impact on performance (1% max). We show that BPP always reduces power even for smaller predictors and that it offers better overall power and performance compared to simpler predictors.

Citations: 34

Register binding based power management for high-level synthesis of control-flow intensive behaviors

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-16 DOI: 10.1109/ICCD.2002.1106800

Lin Zhong, Jiong Luo, Yunsi Fei, N. Jha

A circuit or circuit component that does not contain any spurious switching activity, i.e., activity that is not required by its specified functionality, is called perfectly power managed (PPM). We present a general sufficient condition for register binding to ensure that a given set of functional units is PPM. This condition not only applies to data-flow intensive (DFI) behaviors but also to control-flow intensive (CFI) behaviors. It leads to a straightforward power-managed (PM) register binding algorithm. The proposed algorithm is independent of the functional unit binding and scheduling algorithms. Hence, it can be easily incorporated into existing high-level synthesis systems. For the benchmarks we experimented with, an average 45.9% power reduction was achieved by our method at the cost of 7.7% average area overhead, compared to power-optimized register-transfer level (RTL) circuits which did not use PM register binding.

Citations: 8

A test processor concept for systems-on-a-chip

Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors

Pub Date : 2002-09-01 DOI: 10.1109/ICCD.2002.1106772

C. Galke, M. Pflanz, H. Vierhaus

This paper introduces a new concept for the self test of systems on a chip (SoCs) with embedded processors. We propose a hardware- and software-based test strategy. A minimum-sized test processor was designed in order to perform on-chip test functions. Its architecture contains specially adapted registers to realize LFSR or MISR functions for pattern de-compaction and pattern filtering. High-performance interfaces allow parallel and serial pattern input and output, and a fast test vector comparison. The architecture is scalable and is based on a standard RISC architecture in order to facilitate the use of standard compilers.
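
For readers unfamiliar with the LFSR and MISR functions mentioned in this abstract, the following is a minimal software sketch of both. It does not reproduce the registers of the paper's test processor: the 4-bit width, the feedback taps (bits 0 and 1), and the function names `lfsr_step`, `lfsr_patterns`, and `misr_signature` are all illustrative assumptions.

```python
from typing import Iterable, List

WIDTH = 4                  # register width in bits (illustrative assumption)
MASK = (1 << WIDTH) - 1


def lfsr_step(state: int) -> int:
    """Advance a 4-bit right-shifting LFSR by one clock.

    The feedback bit is the XOR of bits 0 and 1 (assumed taps), which
    gives a maximal-length sequence of 15 non-zero states.
    """
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << (WIDTH - 1))


def lfsr_patterns(seed: int, count: int) -> List[int]:
    """Generate `count` pseudo-random test patterns starting from `seed`."""
    patterns = []
    state = seed & MASK
    for _ in range(count):
        patterns.append(state)
        state = lfsr_step(state)
    return patterns


def misr_signature(responses: Iterable[int], seed: int = 0) -> int:
    """Compact a sequence of response words into one signature.

    A MISR is modelled here as the same shift register with each
    response word XORed into the state after every shift.
    """
    state = seed & MASK
    for word in responses:
        state = lfsr_step(state) ^ (word & MASK)
    return state


patterns = lfsr_patterns(seed=1, count=15)
good_signature = misr_signature(patterns)

# Flip one bit in one response word to mimic a faulty circuit output.
faulty = list(patterns)
faulty[5] ^= 0b0100
bad_signature = misr_signature(faulty)

print(patterns)
print(good_signature != bad_signature)
```

In this toy setup the LFSR acts as the pattern source and the MISR reduces the observed responses to a single word, so a tester only has to compare one signature against a known-good value.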

Citations: 16
