Among the different methods of reducing power in core-based system-on-chip (SoC) designs, the voltage island technique has gained popularity. Assigning cores to supply voltages and floorplanning to create contiguous voltage islands are the two key steps in the design process. We propose a new application-driven, floorplan-aware approach to voltage partitioning and island creation with the objective of reducing overall SoC power, area and runtime. Previous approaches used the voltage assignment table as the starting point for voltage island creation. In this paper, we present a technique to generate a voltage assignment table using dynamic programming. Next, we partition the cores into islands based on the Power State Model (PSM) of the application and the connectivity information used in floorplanning. Finally, solutions are sent to the floorplanner in sequence until a valid solution is reached. Compared to previously reported techniques, our approach achieves a 10% reduction in power and an 8% reduction in area, with an average runtime improvement of 2.3X.
{"title":"Application-driven floorplan-aware voltage island design","authors":"D. Sengupta, R. Saleh","doi":"10.1145/1391469.1391511","DOIUrl":"https://doi.org/10.1145/1391469.1391511","url":null,"abstract":"Among the different methods of reducing power for core-based system-on-chip (SoC) designs, the voltage island technique has gained in popularity. Assigning cores to the different supply voltages and floorplanning to create contiguous voltage islands are the two important steps in the design process. We propose a new application-driven, floorplan-aware approach to voltage partitioning and island creation with the objective of reducing overall SoC power, area and runtime. Previous approaches used the voltage assignment table as the starting point for voltage island creation. In this paper, we present a technique to generate a voltage assignment table using dynamic programming. Next, we partition the cores into islands, based on the Power State Model (PSM) of the application, and connectivity information used in floorplanning. Finally, solutions are sent to the floorplanner in sequence until a valid solution is reached. Compared to previously reported techniques, a 10% reduction in power and 8% reduction in area are achieved using our approach, with an average runtime improvement of 2.3X.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117111127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Alkabani, T. Massey, F. Koushanfar, M. Potkonjak
We present the first approach to post-silicon leakage power reduction through input vector control (IVC) that takes into account the impact of manufacturing variability (MV). Because of MV, the integrated circuits (ICs) implementing one design require different input vectors to achieve their lowest-leakage states. We address two major challenges. The first is the extraction of the gate-level characteristics of an IC by measuring only the overall leakage power for different inputs. The second is the rapid generation of input vectors that result in low leakage for a large number of unique ICs that implement a given design but differ after manufacturing. Experimental results on a large set of benchmark instances demonstrate the efficiency of the proposed methods. For example, leakage power consumption is reduced on average by more than 10.4% compared to previously published IVC techniques that did not consider MV.
{"title":"Input vector control for post-silicon leakage current minimization in the presence of manufacturing variability","authors":"Y. Alkabani, T. Massey, F. Koushanfar, M. Potkonjak","doi":"10.1145/1391469.1391624","DOIUrl":"https://doi.org/10.1145/1391469.1391624","url":null,"abstract":"We present the first approach for post-silicon leakage power reduction through input vector control (IVC) that takes into account the impact of the manufacturing variability (MV). Because of the MV, the integrated circuits (ICs) implementing one design require different input vectors to achieve their lowest leakage states. We address two major challenges. The first is the extraction of the gate- level characteristics of an IC by measuring only the overall leakage power for different inputs. The second problem is the rapid generation of input vectors that result in a low leakage for a large number of unique ICs that implement a given design, but are different in the post-manufacturing phase. Experimental results on a large set of benchmark instances demonstrate the efficiency of the proposed methods. For example, the leakage power consumption could be reduced in average by more than 10.4%, when compared to the previously published IVC techniques that did not consider MV.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116328536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In analog layout design, it is very important to reduce parasitic coupling effects and improve circuit performance. Consequently, the most important device-level placement constraints are matching, symmetry, and proximity. However, many previous works handle these constraints separately, and none of them address how to handle different constraints simultaneously and hierarchically. In this paper, we first give a case study to show the need to integrate these constraints in a hierarchical manner. Then, we present the first formulation for analog placement based on hierarchical module clustering. Our approach can handle analog placement with various constraint groups, including matching, (hierarchical) symmetry, and (hierarchical) proximity groups. To the best of our knowledge, this is also the first work in the literature to handle floorplanning with the clustering constraint using the B*-tree representation. Experimental results based on industrial analog designs show that our approach is very effective and efficient.
{"title":"Analog placement based on hierarchical module clustering","authors":"Mark Po-Hung Lin, Shyh-Chang Lin","doi":"10.1145/1391469.1391484","DOIUrl":"https://doi.org/10.1145/1391469.1391484","url":null,"abstract":"In analog layout design, it is very important to reduce the parasitic coupling effects and improve the circuit performance. Consequently, the most important device-level placement constraints are matching, symmetry, and proximity. However, many previous works deal with these constraints separately, and none of them mention how to handle different constraints simultaneously and hierarchically. In this paper, we first give a case study to show the needs of integrating these constraints in a hierarchical manner. Then, we present the first formulation for analog placement based on hierarchical module clustering. Our approach can handle analog placement with various constraint groups including matching, (hierarchical) symmetry, and (hierarchical) proximity groups. To our best knowledge, this is also the first work in the literature to handle floorplanning with the clustering constraint using the B*-tree based representation. Experimental results based on industrial analog designs show that our approach is very effective and efficient.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128905000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Woo-Cheol Kwon, S. Yoo, Sungpack Hong, Byeong Min, Kyu-Myung Choi, Soo-Kwan Eo
3D stacked memory enables more off-chip DDR memories. Redesigning existing IPs to exploit the increased memory parallelism would be prohibitively costly. In our work, we propose a practical approach to exploit the increased bandwidth and reduced latency of multiple off-chip DDR memories while reusing existing IPs without modification. The proposed approach is based on two new concepts: transaction ID renaming and distributed soft arbitration. We present two on-chip network components, a request parallelizer and a read data serializer, to realize these concepts. Experiments with synthetic test cases and an industrial-strength DTV SoC design show that the proposed approach gives significant improvements in total execution cycles (21.6%) and average memory access latency (31.6%) in the DTV case, with a small area overhead (30.1% relative to the on-chip network and less than 1.4% relative to the entire chip).
{"title":"A practical approach of memory access parallelization to exploit multiple off-chip DDR memories","authors":"Woo-Cheol Kwon, S. Yoo, Sungpack Hong, Byeong Min, Kyu-Myung Choi, Soo-Kwan Eo","doi":"10.1145/1391469.1391585","DOIUrl":"https://doi.org/10.1145/1391469.1391585","url":null,"abstract":"3D stacked memory enables more off-chip DDR memories. Redesigning existing IPs to exploit the increased memory parallelism will be prohibitively costly. In our work, we propose a practical approach to exploit the increased bandwidth and reduced latency of multiple off-chip DDR memories while reusing existing IPs without modification. The proposed approach is based on two new concepts: transaction id renaming and distributed soft arbitration. We present two on-chip network components, request parallelizer and read data serializer, to realize the concepts. Experiments with synthetic test cases and an industrial strength DTV SoC design show that the proposed approach gives significant improvements in total execution cycle (21.6%) and average memory access latency (31.6%) in the DTV case with a small area overhead (30.1% in the on-chip network, and less than 1.4% in the entire chip).","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123145760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modern microprocessors are becoming increasingly parallel devices, and GPUs are at the leading edge of this trend. Designing parallel algorithms for manycore chips like the GPU can present interesting challenges, particularly for computations on sparse data structures. One particularly common example is the collection of sparse matrix solvers and combinatorial graph algorithms that form the core of many physical simulation techniques. Although seemingly irregular, these operations can often be implemented with data parallel operations that map very well to massively parallel processors.
{"title":"Sparse matrix computations on manycore GPU’s","authors":"M. Garland","doi":"10.1145/1391469.1391473","DOIUrl":"https://doi.org/10.1145/1391469.1391473","url":null,"abstract":"Modern microprocessors are becoming increasingly parallel devices, and GPUs are at the leading edge of this trend. Designing parallel algorithms for manycore chips like the GPU can present interesting challenges, particularly for computations on sparse data structures. One particularly common example is the collection of sparse matrix solvers and combinatorial graph algorithms that form the core of many physical simulation techniques. Although seemingly irregular, these operations can often be implemented with data parallel operations that map very well to massively parallel processors.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121592186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Wang, W. Luk, Xuan Zeng, Jun Tao, Changhao Yan, J. Tong, W. Cai, Jia Ni
In nanometer technologies, process variations have a growing nonlinear impact on circuit performance, causing the critical path delays of combinational circuits to vary randomly with non-Gaussian distributions. In this paper, we propose a novel clock skew scheduling methodology that optimizes timing yield by handling non-Gaussian distributions of critical path delays. First, a general formulation of the optimization problem is proposed; it covers most previous formulations and exposes their limitations with statistical interpretations. Then, a generalized minimum balancing algorithm is proposed to solve the skew scheduling problem effectively. Experimental results show that the proposed method significantly outperforms representative methods previously proposed for yield optimization, obtaining timing yield improvements of up to 33.6%, and 17.7% on average.
{"title":"Timing yield driven clock skew scheduling considering non-Gaussian distributions of critical path delays","authors":"Yi Wang, W. Luk, Xuan Zeng, Jun Tao, Changhao Yan, J. Tong, W. Cai, Jia Ni","doi":"10.1145/1391469.1391525","DOIUrl":"https://doi.org/10.1145/1391469.1391525","url":null,"abstract":"In nanometer technologies, process variations possess growing nonlinear impacts on circuit performance, which causes critical path delays of combinatorial circuits variate randomly with non-Gaussian distribution. In this paper, we propose a novel clock skew scheduling methodology that optimizes timing yield by handling non-Gaussian distributions of critical path delays. Firstly a general formulation of the optimization problem is proposed, which covers most of the previous formulations and indicates their limitations with statistical interpretations. Then a generalized minimum balancing algorithm is proposed for effectively solving the skew scheduling problem. Experimental results show that the proposed method significantly outperforms some representative methods previously proposed for yield optimization, and could obtain timing yield improvements up to 33.6% and averagely 17.7%.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126260156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Handheld mobile terminals have developed from simple phones into devices featuring a wide variety of modern multimedia functions; they are, in effect, multimedia computers. In today's mobile terminals, computational demand is close to that of personal desktop computers from only a few years ago. All these new features require more power and bandwidth in the interconnections, and innovations must be brought to these devices at an ever-increasing pace.
{"title":"Standard interfaces in mobile terminals — increasing the efficiency of device design and accelerating innovation","authors":"R. Savolainen, T. Rissa","doi":"10.1145/1391469.1391619","DOIUrl":"https://doi.org/10.1145/1391469.1391619","url":null,"abstract":"Handheld mobile terminals have developed from simple phones to devices featuring a wide variety of modern multimedia functions, being in fact multimedia computers. In today's mobile terminals, computational demand is closes to that of personal desktop computers only a few years ago. All these new features need more power and bandwidth in interconnections. New innovations must be implemented in these devices with ever increasing speed.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114078647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Future CPU directions increasingly emphasize parallel compute platforms, which depend critically upon greater core-to-core communication and stress the overall memory and storage interconnect hierarchy far more than extrapolations of past platform needs would suggest. Performance is critically dependent upon memory bandwidth and latency, but must be balanced against power and cost considerations. 3D stacking of CPUs and memory (i.e., as a last-level cache) is a potential solution that provides the necessary bandwidth within a reasonable power envelope.
{"title":"Tera-scale computing and interconnect challenges","authors":"J. Bautista","doi":"10.1145/1391469.1391641","DOIUrl":"https://doi.org/10.1145/1391469.1391641","url":null,"abstract":"Future CPU directions are increasingly emphasizing parallel compute platforms which are critically dependent upon upon greater core to core communication as well as generally stressing the overall memory and storage interconnect hierarchy to a much greater degree than extrapolations of past platform needs. Performance is critically dependent upon memory bandwidth and latency but must be moderated with power and cost considerations. 3D stacking of CPU's and memory (i.e. a last level cache) is a potential solution that provides the necessary bandwidth within a reasonable power envelope.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131542980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei Huang, M. Stan, K. Sankaranarayanan, R. J. Ribando, K. Skadron
Air cooling limits have been a major design challenge in recent years for integrated circuits. Multi-core exacerbates thermal challenges because power scales with the number of cores, but also creates new opportunities for temperature-aware design, because multi-core designs offer more design parameters than single-core designs. This paper investigates the relationship between core size and on-chip hot spot temperature and shows that with the same power density, smaller cores are cooler than larger cores due to a spatial low-pass filtering effect of temperature. This phenomenon suggests that designs exploiting low-pass filtering can dissipate more power within the same cooling budget than contemporary designs.
{"title":"Many-core design from a thermal perspective","authors":"Wei Huang, M. Stan, K. Sankaranarayanan, R. J. Ribando, K. Skadron","doi":"10.1145/1391469.1391660","DOIUrl":"https://doi.org/10.1145/1391469.1391660","url":null,"abstract":"Air cooling limits have been a major design challenge in recent years for integrated circuits. Multi-core exacerbates thermal challenges because power scales with the number of cores, but also creates new opportunities for temperature-aware design, because multi-core designs offer more design parameters than single-core designs. This paper investigates the relationship between core size and on-chip hot spot temperature and shows that with the same power density, smaller cores are cooler than larger cores due to a spatial low-pass filtering effect of temperature. This phenomenon suggests that designs exploiting low-pass filtering can dissipate more power within the same cooling budget than contemporary designs.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115148180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Today, embedded processors are expected to be able to run complex, algorithm-heavy applications that were originally designed and coded for general-purpose processors. As a result, traditional methods for addressing performance and determinism become inadequate. This paper explores a new data cache design for use in modern high-performance embedded processors that will dynamically improve execution time, power efficiency, and determinism within the system. The simulation results show significant improvement in cache miss ratios and reduction in power consumption of approximately 30% and 15%, respectively.
{"title":"Miss reduction in embedded processors through dynamic, power-friendly cache design","authors":"Garo Bournoutian, A. Orailoglu","doi":"10.1145/1391469.1391546","DOIUrl":"https://doi.org/10.1145/1391469.1391546","url":null,"abstract":"Today, embedded processors are expected to be able to run complex, algorithm-heavy applications that were originally designed and coded for general-purpose processors. As a result, traditional methods for addressing performance and determinism become inadequate. This paper explores a new data cache design for use in modern high-performance embedded processors that will dynamically improve execution time, power efficiency, and determinism within the system. The simulation results show significant improvement in cache miss ratios and reduction in power consumption of approximately 30% and 15%, respectively.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134186213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}