Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems: latest papers

Built-in self-testable data path synthesis
Laurence Tianruo Yang, Jon Muzio
In this paper, we describe a high-level data path allocation algorithm to facilitate built-in self-test (BIST). It generates a self-testable data path design while maximizing the sharing of modules and test registers. Because modules and test registers are shared, only a small number of registers need to be modified for BIST, which decreases the hardware area, one of the major overheads of the BIST technique. In our approach, both module allocation and register allocation are performed incrementally. In each iteration, module allocation is guided by a testability balance technique, while register allocation aims at increasing the sharing degrees of registers. With a variety of benchmarks, we demonstrate the advantage of our approach compared with other conventional approaches.
DOI: https://doi.org/10.1109/IWV.2001.923143 (published 2001-04-19)
Citations: 8
Energy locality: processing/communication/interface tradeoffs to optimize energy in mobile systems
D. Siewiorek
Summary form only given, as follows. As computers continue to shrink, research and commercial interest in mobile/wearable computing is rapidly growing. Unlike traditional desktop computing, in which the user is required to come to the computer, mobile computing brings the computer to the user. Mobile/wearable computers represent the next evolutionary step in the trend toward more people-centric computing. One of the key problems with mobile/wearable computing is energy consumption. Battery weight for mobile/wearable computers often exceeds the weight of all other components combined. In order to make mobile/wearable computing widely applicable, major advances in reducing power consumption and battery weight are needed. While a "Moore's Law" exists for power consumption of microprocessors, with mW/MIPS decreasing by a factor of ten every five years, there is no similar trend in wireless communications. This suggests that future wearable computers will be communications bound. In fact, we estimate that nearly 80% of the power consumed by wearable computers can be due to communications. Trading off energy-expensive communication for energy-cheap computation through effective partitioning of control and data can result in significant energy savings. Examples and measurements will illustrate how the use of proxies can reduce power consumption due to communications by several orders of magnitude. In addition, the interface design must be carefully matched with user tasks and balanced against energy consumption. Many complex and interrelated issues determine the balance between ease-of-use and power consumption. Simply trading off ease-of-use for lower per-operation power consumption may result in higher task energy consumption due to the increase in the number of operations needed to traverse a less intuitive interface.
The effect of user interface on energy consumption can be evaluated by developing several different interfaces and measuring and comparing the ease-of-use and energy consumption. In conclusion, an architecture that supports these studies will be introduced. The Spot wearable computer includes a dozen power monitors that can be read under software control to determine which subsystems are active and their power consumption during an application.
DOI: https://doi.org/10.1109/IWV.2001.923131 (published 2001-04-19)
Citations: 3
Structural design composition for C++ hardware models
F. Doucet, V. Sinha, Raj Kumar Gupta
This paper addresses the modeling of layout structure in high-level C++ models. Researchers agree that the level of abstraction for integrated circuit design needs to be raised. New languages and methodologies are being proposed, most of them from the software engineering domain. However, one of the fundamental hardware design challenges is often overlooked as push-button synthesis solutions are sought: physical design predictability. In this paper we describe how C++ constructs should be used to capture structural and physical implementation concerns. Our explanation relies on the importance of the floorplan and component placement estimations at high levels of abstraction. We highlight how using object-oriented mechanisms eases the structural modeling of circuit components, and we present a C++ class library design to specify these structural concerns.
DOI: https://doi.org/10.1109/IWV.2001.923137 (published 2001-04-19)
Citations: 2
A pipelined LNS ALU
M. Arnold
A new ALU design is proposed that is more economical than a conventional Logarithmic Number System (LNS) ALU for pipelined multiply-accumulate applications (such as FIR filters). A novel interpolator that accepts both positive and negative arguments allows rearrangement of the fixed-point adders that implement the LNS addition algorithm. The area for the resulting circuit is essentially the same as the traditional LNS approach, but the critical path for the proposed circuit is shorter, allowing a faster cycle time and/or a shorter latency. To make the advantages of the improved LNS ALU available to end users, new primitive operations (increment-multiply and multiply-increment-multiply) should be supported instead of the more traditional add and multiply-accumulate operations. The Verilog coding for such a novel increment-multiply module is given.
DOI: https://doi.org/10.1109/IWV.2001.923155 (published 2001-04-19)
Citations: 19
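As a rough illustration of the LNS addition algorithm mentioned in the abstract of "A pipelined LNS ALU", the following is a minimal Python sketch, not the paper's Verilog module. It assumes positive values only (no sign bit), uses the exact function s_b(z) = log2(1 + 2^z) where hardware would use a table or interpolator, and all function names are hypothetical.

```python
import math


def sb(z: float) -> float:
    """Exact LNS addition function s_b(z) = log2(1 + 2**z), used here for z <= 0.

    A hardware ALU would approximate this with a lookup table or interpolator.
    """
    return math.log2(1.0 + 2.0 ** z)


def lns_mul(x: float, y: float) -> float:
    """Multiply two values given by their base-2 logarithms: logs simply add."""
    return x + y


def lns_add(x: float, y: float) -> float:
    """Add two positive values given by their base-2 logarithms.

    Uses log2(2**hi + 2**lo) = hi + s_b(lo - hi), so the argument of s_b
    is never positive.
    """
    hi, lo = max(x, y), min(x, y)
    return hi + sb(lo - hi)


def fir_lns(coeffs, samples) -> float:
    """Multiply-accumulate over positive coefficients and samples in the log domain.

    Returns the result converted back to an ordinary real number.
    """
    acc = None
    for c, s in zip(coeffs, samples):
        term = lns_mul(math.log2(c), math.log2(s))
        acc = term if acc is None else lns_add(acc, term)
    if acc is None:
        raise ValueError("need at least one coefficient/sample pair")
    return 2.0 ** acc
```

For example, `fir_lns([0.5, 0.25, 2.0], [4.0, 8.0, 1.0])` accumulates 2 + 2 + 2 and returns approximately 6.0. The sketch shows why multiplication is cheap in LNS (one fixed-point addition) while addition needs the nonlinear s_b function, which is where an interpolator would come in.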
Current sensing techniques for global interconnects in very deep submicron (VDSM) CMOS
A. Maheshwari, Wayne Burleson
Sensing current instead of voltage provides an alternative to signaling on the long wires that are increasingly limiting the performance of CMOS as it scales into the VDSM regime (<0.25 µm). Current-mode techniques have been proposed for sensing bit-lines. We present a comparative study of current-sensing with the optimal repeater insertion technique for wires from 0.35 cm to 1.75 cm in length. Simulation results using SPICE for 0.18 µm showed that current-sensing was faster and lower-power when compared to the optimal repeater insertion technique. While the power dissipated by the optimal repeater circuit increased linearly with line length, power dissipated by the current-sensing circuit was almost constant for longer lines. Inductance had little effect on the differential current sensing technique.
DOI: https://doi.org/10.1109/IWV.2001.923141 (published 2001-04-19)
Citations: 40
A novel architecture for low-power design of parallel multipliers
A. Fayed, M. Bayoumi
In this paper, a new architecture for low-power design of parallel multipliers is proposed. Power consumption is reduced by lowering circuit activity at the architecture level: the multiplication circuit is divided into clusters of smaller multipliers. By applying clock gating techniques and preprocessing operations on the input pattern using simple logic functions, the clusters that would produce a zero result can be disabled, saving the switching power that these clusters would otherwise consume. The amount of power savings depends on the nature of the input pattern, which varies according to the application. Analysis of the input pattern is performed. For testing purposes, an 8-bit multiplier prototype is constructed in 0.35 micron double-metal CMOS technology using Cadence development tools. For the average case, when all the input combinations have an equal probability of occurrence, HSPICE simulation results at 3.3 V and 500 MHz show that the proposed architecture results in 13.4% power savings.
DOI: https://doi.org/10.1109/IWV.2001.923154 (published 2001-04-19)
Citations: 56
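The clustering idea in the abstract of "A novel architecture for low-power design of parallel multipliers" can be sketched in software. This is only a behavioral model under assumptions: the abstract does not give the cluster sizes or the gating logic, so the 4-bit clusters, the "either operand slice is zero" test, and the function name are hypothetical choices for illustration.

```python
def clustered_multiply(a: int, b: int, width: int = 8, cluster: int = 4):
    """Multiply two unsigned `width`-bit integers using `cluster`-bit sub-multipliers.

    Returns (product, active_clusters). A sub-multiplier is counted as active
    only when both of its operand slices are nonzero; otherwise its output is
    known to be zero and, in hardware, it could be clock-gated.
    """
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in `width` bits")
    mask = (1 << cluster) - 1
    product = 0
    active = 0
    for i in range(0, width, cluster):
        a_part = (a >> i) & mask
        for j in range(0, width, cluster):
            b_part = (b >> j) & mask
            if a_part == 0 or b_part == 0:
                # Simple preprocessing check: this cluster would output zero,
                # so it is skipped (the software analogue of gating it off).
                continue
            active += 1
            # Each partial product is weighted by the positions of its slices.
            product += (a_part * b_part) << (i + j)
    return product, active
```

With the default settings an 8-bit multiply uses four 4x4 clusters. For small operands such as `clustered_multiply(0x0F, 0x03)` only one cluster is active, which is consistent with the abstract's remark that the savings depend on the input pattern.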
Application of output prediction logic to differential CMOS
Su Go, L. McMurchie, Carl Sechen
We apply the output prediction logic (OPL) technique to the differential CMOS logic family. Including the effects of process, voltage and temperature (PVT) variations, we show that OPL differential CMOS is more than 40% faster than the single-rail OPL-dynamic logic family, and nearly 5 times faster than optimized static CMOS. We also demonstrate an OPL-differential 64:2 compressor that is 37% faster than the OPL-dynamic version. Finally, we show that OPL-differential is nearly twice as fast as differential domino.
DOI: https://doi.org/10.1109/IWV.2001.923140 (published 2001-04-19)
Citations: 9
A low power SIMD architecture for affine-based texture mapping
Wael Badawy
This paper presents a novel low-power SIMD architecture for texture mapping using an affine transformation. Low power has been achieved by exploiting the properties of the affine transformation to reduce the computational cost. The architecture has been prototyped using 0.35 µm CMOS technology with three layers of metal. The proposed architecture can be used in video object motion tracking and texture warping processors.
DOI: https://doi.org/10.1109/IWV.2001.923151 (published 2001-04-19)
Citations: 0
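The abstract of "A low power SIMD architecture for affine-based texture mapping" does not say which properties of the affine transformation are used. One well-known property that lowers computational cost is that an affine map is linear in the pixel coordinates, so texture coordinates can be updated with additions along a scanline. The sketch below illustrates that property only; the parameter layout (a, b, c, d, e, f) and function names are assumptions, and it is not claimed to be the paper's method.

```python
def affine_direct(params, width, height):
    """Evaluate u = a*x + b*y + c, v = d*x + e*y + f at every pixel (multiplies per pixel)."""
    a, b, c, d, e, f = params
    return [[(a * x + b * y + c, d * x + e * y + f) for x in range(width)]
            for y in range(height)]


def affine_incremental(params, width, height):
    """Same mapping, computed with additions only inside the loops.

    Moving one pixel right adds (a, d); moving one row down adds (b, e).
    """
    a, b, c, d, e, f = params
    rows = []
    u_row, v_row = c, f          # texture coordinates at pixel (0, 0)
    for _ in range(height):
        u, v = u_row, v_row
        row = []
        for _ in range(width):
            row.append((u, v))
            u += a               # step in x
            v += d
        rows.append(row)
        u_row += b               # step in y
        v_row += e
    return rows
```

With integer (or fixed-point) parameters the two functions return identical results, while the incremental version replaces four multiplications per pixel with two additions. With floating-point parameters the incremental form can accumulate rounding error, which is one reason a fixed-point representation is assumed here.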
VLIW scheduling for energy and performance
A. Parikh, M. Kandemir, N. Vijaykrishnan, M. J. Irwin
We present and evaluate several instruction scheduling algorithms that reorder a given sequence of instructions while taking energy considerations into account. We first compare a performance-oriented scheduling technique with three energy-oriented instruction scheduling algorithms from both performance (execution cycles of the resulting schedules) and energy consumption points of view. Then, we propose scheduling algorithms that consider energy and performance at the same time. The results obtained using randomly generated directed acyclic graphs show that these techniques are quite successful in reducing energy consumption and that their performance (in terms of execution cycles) is comparable to that of pure performance-oriented scheduling.
DOI: https://doi.org/10.1109/IWV.2001.923148 (published 2001-04-19)
Citations: 9
Towards a very high bandwidth wireless battery powered device
J. Glossner, D. Routenberg, E. Hokenek, M. Moudgill, M. Schulte, P. Balzola, S. Vassiliadis
We discuss the hardware and software challenges in building a 2 Mbit per second wireless battery powered communications device. Of primary importance is power dissipation. To achieve aggressive power targets, a host of new techniques are required at all levels of the design hierarchy. Techniques for parallelizing saturating arithmetic will become important because of the software optimizations they enable. Highly configurable programmable structures will enable multiprotocol SOC solutions. To program complex SOCs, new compiler techniques will be required. Hardware implementations will need to be intimately aware of these software techniques. In particular both signal processing code written in C and control code written in Java will drive new compilation techniques to enable broadband 3G wireless systems.
DOI: https://doi.org/10.1109/IWV.2001.923132 (published 2001-04-19)
Citations: 2