
2009 22nd International Conference on VLSI Design: Latest Papers

New Techniques for Accelerating Small Delay ATPG and Generating Compact Test Sets
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.64
Boxue Yin, D. Xiang, Zhen Chen
Small-delay-defect testing poses two challenges. First, selecting the longest testable path for every target fault during ATPG consumes a great deal of CPU time. Second, the test data volume is very large. In this paper, we propose two strategies to address these problems. To accelerate ATPG, we propose a new scheme that selects paths in advance; unlike previous work, it aims to find fewer paths that cover more faults. To reduce the test data volume, we propose a novel scan-based test scheme. We partition the scan flip-flops into several scan chains. The first scan flip-flop of every scan chain works in enhanced-scan mode, and the other scan flip-flops work in broad-side mode. This significantly increases the number of don't-care bits in every test pattern and provides more room for test compaction, so the test pattern count can be reduced significantly. Experimental results show the efficiency of these techniques.
Citations: 7
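As a rough illustration of why additional don't-care bits leave more room for test compaction, as the abstract above argues, the following Python sketch greedily merges compatible test cubes. It is a generic static-compaction heuristic and not the authors' scheme; the function names `merge_cubes` and `compact` and the example cubes are hypothetical.

```python
from typing import List, Optional


def merge_cubes(a: str, b: str) -> Optional[str]:
    """Merge two test cubes over {'0', '1', 'X'}; return None on a conflict."""
    if len(a) != len(b):
        raise ValueError("cubes must have equal length")
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)          # a does not care, so take b's bit
        elif y == "X" or x == y:
            merged.append(x)          # b does not care, or both agree
        else:
            return None               # one cube needs 0 where the other needs 1
    return "".join(merged)


def compact(cubes: List[str]) -> List[str]:
    """Greedy compaction: fold each cube into the first compatible pattern."""
    patterns: List[str] = []
    for cube in cubes:
        for i, pattern in enumerate(patterns):
            merged = merge_cubes(pattern, cube)
            if merged is not None:
                patterns[i] = merged
                break
        else:
            patterns.append(cube)     # no compatible pattern was found
    return patterns


if __name__ == "__main__":
    # Cubes with many don't-care bits collapse into two patterns.
    print(compact(["1XX0", "X1X0", "0XX1", "XX11"]))  # ['11X0', '0X11']
    # Fully specified cubes cannot be merged at all.
    print(len(compact(["1010", "0110", "0101", "0011"])))  # 4
```

In this toy input, the cubes that carry don't-care bits compact from four patterns to two, whereas the fully specified ones stay at four.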
Design and Implementation of Fine-Grain Power Gating with Ground Bounce Suppression
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.63
K. Usami, T. Shirai, T. Hashida, H. Masuda, S. Takeda, M. Nakata, N. Seki, H. Amano, M. Namiki, Masashi Imai, Masaaki Kondo, Hiroshi Nakamura
This paper describes a design and implementation methodology for fine-grain power gating. Since sleep-in and wakeup are controlled at a fine granularity at run time, shortening the transition time between the sleep and active states is strongly required. In particular, shortening the wakeup time is essential because it affects the execution time and hence the performance. However, this requirement makes suppression of the ground bounce more difficult. We propose a novel technique that skews the wakeup timings of fine-grain local power domains to suppress the ground bounce. The delays of the buffers driving the power switches are skewed within the buffer tree by selectively downsizing the buffers. We designed a MIPS R3000-based CPU core in a 90 nm CMOS technology and applied our technique to its internal function units. Simulation results showed that our technique reduces the rush current to 47% compared with the case in which the power switches are turned on simultaneously. This suppressed the ground bounce to 53 mV with a 3.3 ns wakeup time. Simulation results from running benchmark programs showed that the total power dissipation of the function units was reduced by up to 15% at 25°C and by 62% at 100°C. The effectiveness of the power savings is discussed from the viewpoint of the temperature-dependent break-even points and the consecutive idle time in the program.
Citations: 39
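The intuition behind skewing wakeup timings in the abstract above can be sketched with a toy model in Python. The model assumes that each power domain draws an identical triangular rush-current pulse when its switches turn on; the pulse shape, the function names `pulse` and `peak_total_current`, and all numbers are hypothetical and unrelated to the paper's simulated figures.

```python
from typing import List


def pulse(t: float, start: float, width: float, peak: float) -> float:
    """Triangular current pulse: rises to `peak` at mid-width, then falls to zero."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (1.0 - abs(2.0 * x - 1.0))


def peak_total_current(starts: List[float], width: float = 1.0,
                       peak: float = 1.0, step: float = 0.01) -> float:
    """Sample the summed current of all domains and return its maximum."""
    end = max(starts) + width
    samples = int(round(end / step))
    return max(
        sum(pulse(i * step, s, width, peak) for s in starts)
        for i in range(samples + 1)
    )


if __name__ == "__main__":
    simultaneous = peak_total_current([0.0, 0.0, 0.0, 0.0])
    skewed = peak_total_current([0.0, 0.5, 1.0, 1.5])
    print(f"simultaneous peak: {simultaneous:.2f}")
    print(f"skewed peak:       {skewed:.2f}")
```

In this toy setup the four simultaneous pulses stack on top of each other, while pulses skewed by half a pulse width never overlap by more than two at a time, so the summed peak is lower at the cost of a longer total wakeup interval.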
A Method for the Multi-Net Multi-Pin Routing Problem with Layer Assignment
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.30
T. Samanta, H. Rahaman, P. Ghosal, P. Dasgupta
Interconnects are vital in deep sub-micron VLSI design, as they impose constraints such as delay, congestion, crosstalk and power dissipation, among others, and consume resources. These parameters affect efforts to obtain a feasible solution for the global routing of multiple nets. In addition, efforts are under way to explore and use non-Manhattan routing architectures. In this work, we focus on the specific problem of multi-net multi-pin global Y-routing for custom-built design styles with several available routing layers. The problem is formulated as a minimum-crossing Y-Steiner minimal tree problem with multi-layer assignment. Experimental results are quite encouraging.
Citations: 1
Why is Design Automation and Reuse of Analog Designs Increasingly Trailing the Digital World?
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.107
G. Agarwal, Prakash Bare
Summary: Demand for high performance in today's systems requires some key IP components to be designed in analog, whereas quick turnaround time requires significant use of digital IPs. While there has been significant progress in design automation and design reuse of digital circuits in the last couple of decades, not much has changed for analog design. Design capture in low
Citations: 0
Security and Dependability of Embedded Systems: A Computer Architects' Perspective
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.114
Jörg Henkel, N. Vijaykrishnan, S. Parameswaran, R. Ragel
Designers of embedded systems have traditionally optimized circuits for speed, size, power and time to market. Recently, however, the dependability of the system has emerged as a major concern for the modern designer, given the decrease in feature size and the increase in the demand for functionality. Another crucial concern is the security of systems used for storing personal details and for financial transactions. A significant number of the techniques used to address security and dependability problems are the same or have similar origins. Thus this tutorial will examine the overlapping concerns of security and dependability and the design methods used to overcome the problems and threats. This tutorial is divided into four parts: the first will examine dependability issues due to technology effects; the second will look at reliability-aware designs; the third will describe the security threats; and the fourth will illustrate the countermeasures to security and reliability issues.
Part I: Dependability Issues due to Technology Effects and Architectural Countermeasures. Moore’s law has been in place for more than four decades. Each new technology node provided advantages in basically all major design constraints (performance, power, area, etc.). When migrating to upcoming technology nodes, it will become obvious that this win-win situation will soon come to an end. In other words, migrating to new technology nodes will become far more difficult and expensive in the future. One major point is an inherent undependability, which will become a challenging problem. The undependability addressed in this part of the tutorial is related to a) fabrication and design-time effects such as “Yield and Process Variations” and “Complexity”, and b) run-time effects such as “Aging Effects”, “Thermal Effects” and “Soft Errors”. The first part of this tutorial will give the details of these effects and an outlook on how they might influence future architectures for embedded systems. An overview of selected state-of-the-art paradigms and approaches is given, including a focus on organic computing principles as well as run-time adaptive embedded processor architectures that can deal with dependability issues.
Part II: Reliability Aware Design for Embedded Systems. Design of robust embedded systems meeting stringent quality, reliability, and availability requirements is becoming increasingly difficult in advanced technologies. The current design paradigm, which assumes that no gate or interconnect will ever operate incorrectly within the lifetime of a product, must change to cope with such failures. New architectural features are required for robust system design with built-in mechanisms for failure tolerance, detection and recovery during normal system operation. This part of the tutorial will focus on new design techniques required for building robust systems: concurrent error detection, recovery, and self-repair. A broad spectrum of circuit-level, logic-level, microarchitectural, hardware-subsystem and software techniques will be covered, and the associated trade-offs among the techniques will be presented. The protection mechanisms implemented are determined by complex power evaluation
Citations: 10
An Approach to Measure the Performance Impact of Dynamic Voltage Fluctuations Using Static Timing Analysis
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.45
Ramamurthy Vishweshwara, R. Venkatraman, H. Udayakumar, N. Arvind
Design closure for predictable silicon performance is emerging as the most challenging digital VLSI design problem in advanced deep-submicron technology nodes. One of the significant problems is effective power-grid distribution, together with understanding the impact of voltage drops in the power grid on design timing and performance. This paper proposes a way to account for the complex interactions between timing and dynamic power drops without being significantly pessimistic and without losing accuracy. We highlight the heuristics that we have used to reduce the complexity of the timing analysis and the overall computation time. The overall method uses conventional analysis approaches for dynamic voltage drop and timing. It proposes options for accounting for the effects of dynamic voltage drops within traditional design-closure methods and also highlights means of validating any assumptions made. Comparisons between the performance degradation obtained under voltage-drop assumptions and that obtained with traditional margin-based approaches show a significant reduction in pessimism; these results are presented in this paper.
Citations: 8
Accelerating System-Level Design Tasks Using Commodity Graphics Hardware: A Case Study
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.35
Unmesh D. Bordoloi, S. Chakraborty
Many system-level design tasks (e.g. timing analysis, hardware/software partitioning and design space exploration) involve computational kernels that are intractable (usually NP-hard). As a result, they involve high running times even for mid-sized problems. In this paper we explore the possibility of using commodity graphics processing units (GPUs) to accelerate such tasks that commonly arise in the electronic design automation (EDA) domain. We demonstrate this idea via a detailed case study on a general hardware/software design space exploration problem and propose a GPU-based engine for it. Not only does this problem commonly arise in the embedded systems domain, its computational kernel turns out to be a general combinatorial optimization problem (viz. the knapsack problem) which lies at the heart of several EDA applications. Our experimental results show that our GPU-based implementation offers very attractive speedups for this computational kernel (up to 100×), and speedups of up to 17× for the full problem. In contrast to ASIC/FPGA-based accelerators – since even low-end desktop and notebook computers are today equipped with GPUs – our solution involves no extra hardware cost. Although recent research has shown the benefits of using GPUs for a variety of non-graphics applications (e.g. in databases and bioinformatics), hardly any work has been done on harnessing the parallelism of GPUs to accelerate problems from the EDA domain. We hope that our results and the generality of the problem we address will motivate researchers from this community to explore the possibility of using GPUs for a wider variety of problems from the EDA domain.
Citations: 7
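The abstract above names the knapsack problem as its computational kernel without giving details. As a minimal sketch, assuming the 0/1 variant with non-negative integer weights, the textbook dynamic program below shows why such a kernel may suit data-parallel hardware: every entry of a new row depends only on the previous row, so the entries of one row can be computed independently. This is not the paper's GPU engine, and the function name `knapsack` is hypothetical.

```python
from typing import List


def knapsack(values: List[int], weights: List[int], capacity: int) -> int:
    """Return the best total value of a 0/1 knapsack via row-by-row DP."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have equal length")
    best = [0] * (capacity + 1)       # best[c]: best value within capacity c
    for value, weight in zip(values, weights):
        prev = best
        # Each entry reads only `prev`, so a whole row could be evaluated
        # in parallel, for example with one thread per capacity value.
        best = [
            max(prev[c], prev[c - weight] + value) if c >= weight else prev[c]
            for c in range(capacity + 1)
        ]
    return best[capacity]


if __name__ == "__main__":
    print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220
```

The sequential loop over items remains, so in this formulation the parallelism is limited to the width of a row, that is, to the capacity.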
ReConfigurable Technologies
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.123
Mona Mathur
It has been envisioned that, in the future, designers will have the complete flexibility that software offers at hardware speeds, which will substantially reduce cost and product turnaround time. The optimal performance needs of applications can be met if fine-grained field reconfiguration can be made possible in hardware. Several problems and challenges need to be addressed; these include the specification of reconfigurable architectures and processors, software environments that support reconfiguration, the increasing heterogeneity and complexity of systems and SoCs, and power management. One of the goals of this talk is to stimulate a discussion on reconfigurable design by introducing some key issues.
Citations: 7
Simultaneous Routing and Feedthrough Algorithm to Decongest Top Channel
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.83
S. Prasad, Anuj Kumar
In macrocell-based SoC design, a routing plan to decongest the top channel is an important step during floorplanning. While previous approaches attempt to reduce congestion of the chip as a whole, they make no attempt to specifically decongest the top channel. We present an algorithmic approach that decongests the top channel using very few feedthroughs. Results show that, compared to conventional methods, we can decongest the top channel using 20% fewer feedthrough buffers and with better utilization of top-channel routing resources.
Citations: 0
Exploring the Limits of Port Reduction in Centralized Register Files
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.29
Sandeep Sirsi, Aneesh Aggarwal
Register file access falls on the critical path of a microprocessor because large, heavily ported register files are used to exploit more parallelism. In this paper, we focus on reducing register file complexity by reducing the number of register file read ports. The goal is to explore the limits of read port reduction in a centralized integer register file, i.e., how few read ports can a centralized integer register file provide while still maintaining performance? A naïve port reduction may cause significant performance degradation and does not give a true measure of the limits, whereas cleverer techniques may allow the number of ports to be reduced further. Hence, we drastically reduce the number of ports and then investigate techniques to improve the performance of the reduced-port register file. Our experiments show that these techniques allow further port reduction by improving the performance of reduced-port register files (RFs). For instance, with our experimental parameters, the naïve port reduction method requires at least five read ports to keep the performance impact below 5%, whereas our techniques require only three.
Citations: 4
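The read-port trade-off that the register file abstract above describes can be illustrated with a toy model. The sketch below is not the authors' technique (the abstract does not describe it) and does not reproduce their five-versus-three-port result. It assumes a simple in-order issue stage in which each instruction needs zero to two register reads, and it only counts how many cycles a naïve port limit costs. The function name `cycles_needed` and the sample operand counts are hypothetical.

```python
from typing import Sequence


def cycles_needed(operand_counts: Sequence[int],
                  issue_width: int,
                  read_ports: int) -> int:
    """Count cycles to issue instructions in order under a read-port limit.

    Hypothetical toy model: operand_counts[i] is the number of register
    reads instruction i needs. Each cycle issues instructions in program
    order until the issue width or the read-port budget is exhausted.
    """
    if issue_width < 1:
        raise ValueError("issue_width must be at least 1")
    if read_ports < max(operand_counts, default=0):
        raise ValueError("an instruction needs more read ports than exist")

    cycles = 0
    i = 0
    n = len(operand_counts)
    while i < n:
        issued = 0
        ports_used = 0
        # Stop at the first instruction that does not fit this cycle.
        while (i < n and issued < issue_width
               and ports_used + operand_counts[i] <= read_ports):
            ports_used += operand_counts[i]
            issued += 1
            i += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Made-up instruction stream, 4-wide issue (assumed parameters).
    reads = [2, 2, 1, 0, 2, 1, 2, 2]
    for ports in (8, 5, 3):
        print(ports, "read ports ->", cycles_needed(reads, 4, ports), "cycles")
```

In this model every cycle lost to port contention is pure slowdown, which is the degradation a naïve reduction incurs; any technique that lets instructions avoid a register file read would lower the effective operand counts and recover some of those cycles.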