As the power consumption of modern electronic devices varies more dramatically, device voltages may fluctuate significantly as well. Consideration of dynamic IR-drop has become indispensable to power network design. Since resolving voltage violations according to all power consumption files in all time slots is impractical, this paper applies a clustering-based approach to find representative power consumption files and shows that most IR-drop violations can be repaired if the power network is fixed according to these files. To further reduce runtime, we also propose an efficient and effective power network optimization approach. In contrast to the intuitive approach, which repairs the power network file by file, our approach alternates between power consumption files and, in each iteration, repairs the file whose worst IR-drop violation region involves the most power consumption files. Since many violations can be resolved at the same time, this method is much faster than repairing file by file. Experimental results show that the proposed algorithm not only eliminates voltage violations efficiently but also constructs a power network with fewer routing resources.
"A Fast Power Network Optimization Algorithm for Improving Dynamic IR-drop". Jai-Ming Lin, Yang-Tai Kung, Zhengqiu Huang, I-Ru Chen. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3447042
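The clustering step described in the abstract above can be sketched as follows. This is a hypothetical illustration only (plain k-means over per-region power vectors, taking the medoid of each cluster as the representative power consumption file), not the paper's actual algorithm:

```python
import numpy as np

def representative_power_files(power_maps, k, iters=50, seed=0):
    """Pick k representative power maps, one medoid per k-means cluster.

    power_maps: (n_files, n_regions) array, one row per power consumption file.
    Returns the indices of the chosen representative files.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(power_maps, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each file to its nearest cluster center.
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    reps = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if len(idx):
            # Medoid: the actual file closest to the cluster mean.
            reps.append(idx[np.argmin(((X[idx] - centers[c]) ** 2).sum(-1))])
    return reps
```

Repairing the power grid against only these medoid files is what keeps the approach tractable compared to checking every file in every time slot.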
P. Groeneveld, Michael James, V. Kibardin, I. Sharapov, Marvin Tom, Leo Wang
Solving 3-D partial differential equations in a Finite Element model is computationally intensive and requires extremely high memory and communication bandwidth. This paper describes a novel approach in which Finite Element mesh points of varying resolution are mapped onto a large 2-D homogeneous array of processors. Cerebras developed a novel supercomputer powered by a 21.5 cm by 21.5 cm Wafer-Scale Engine (WSE) with 850,000 programmable compute cores. With 2.6 trillion transistors in a 7 nm process, this is by far the largest chip in the world. It is structured as a regular array of 800 by 1060 identical processing elements, each with its own fast local SRAM memory and a direct high-bandwidth connection to its neighboring cores. For the 2021 ISPD competition we propose a challenge to optimize the placement of computational physics problems to achieve the highest possible performance on the Cerebras supercomputer. The objectives are to maximize performance and accuracy by optimizing the mapping of the problem to cores in the system. This involves partitioning and placement algorithms.
"ISPD 2021 Wafer-Scale Physics Modeling Contest: A New Frontier for Partitioning, Placement and Routing". P. Groeneveld, Michael James, V. Kibardin, I. Sharapov, Marvin Tom, Leo Wang. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3446904
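A minimal sketch of the mesh-to-processor mapping problem described above, assuming mesh points carry normalized 2-D coordinates and each processing element handles the points that fall inside its grid tile. The contest's real objective (performance and accuracy on the WSE) is far more involved; the maximum tile load here is only a crude proxy for load balance:

```python
def map_mesh_to_grid(points, grid_rows, grid_cols):
    """Map 2-D mesh point coordinates onto a grid of processing elements.

    points: iterable of (x, y) pairs with x, y in [0, 1).
    Returns ({(row, col): [point indices]}, max tile load).
    """
    tiles = {}
    for i, (x, y) in enumerate(points):
        # Clamp so points exactly on the upper boundary stay in range.
        r = min(int(y * grid_rows), grid_rows - 1)
        c = min(int(x * grid_cols), grid_cols - 1)
        tiles.setdefault((r, c), []).append(i)
    max_load = max(len(v) for v in tiles.values())
    return tiles, max_load
```

For the actual WSE, `grid_rows` and `grid_cols` would be 800 and 1060, and a good mapping must also keep communicating mesh points on neighboring cores, which is where the partitioning and placement algorithms come in.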
The 2021 International Symposium on Physical Design lifetime achievement award goes to Dr. Louis K. Scheffer for his outstanding contributions to the field. This autobiography, in Lou's own words, provides a glimpse of what has happened throughout his career.
"A Lifetime of ICs, and Cross-field Exploration: ISPD 2021 Lifetime Achievement Award Bio". L. Scheffer. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3447046
Tonmoy Dhar, K. Kunal, Yaguang Li, Yishuang Lin, Meghna Madhusudan, Jitesh Poojary, A. Sharma, S. Burns, R. Harjani, Jiang Hu, P. Mukherjee, Soner Yaldiz, S. Sapatnekar
The quality of layouts generated by automated analog design tools has traditionally not matched that of human designers over a wide range of analog designs. The ALIGN (Analog Layout, Intelligently Generated from Netlists) project [2, 3, 6] aims to build an open-source analog layout engine [1] that overcomes these challenges using a variety of approaches. An important part of the toolbox is the use of machine learning (ML) methods combined with traditional methods, and this talk overviews our efforts. The input to ALIGN is a SPICE-like netlist and a set of performance specifications, and the output is a GDSII layout. ALIGN automatically recognizes hierarchies in the input netlist. To detect variations of known blocks in the netlist, approximate subgraph isomorphism methods based on graph convolutional networks can be used [5]. Repeated structures in a netlist are typically constrained by layout requirements related to symmetry or matching. In [7], we use a mix of graph methods and ML to detect symmetric and array structures, including neural-network-based approximate matching built on the notion of graph edit distance. Once the circuit is annotated, ALIGN generates the layout, going from the lowest-level cells to higher levels of the netlist hierarchy. Based on an abstraction of the process design rules, ALIGN builds parameterized cell layouts for each structure, accounting for the need for common-centroid layouts where necessary [11]. These cells then undergo placement and routing that honor the geometric constraints (symmetry, common centroid). The chief parameters that change during layout are the interconnect RC parasitics: excessively large RCs could make it impossible to meet performance.
These values can be controlled by reducing the distance between blocks or, in the case of R, by using larger effective wire widths (multiple parallel connections in FinFET technologies, where wire widths are quantized) to reduce the effective resistance. ALIGN has developed several ML-based approaches for this purpose [4, 8, 9] that rapidly predict whether a layout will meet the performance constraints imposed at the circuit level; these can be deployed together with conventional algorithmic methods [10] to rapidly prune out infeasible layouts. This presentation overviews our experience in using ML-based methods in conjunction with conventional algorithmic approaches for analog design. We will show (a) results from our efforts so far, (b) appropriate ways to mix ML methods with traditional algorithmic techniques to solve the larger problem of analog layout, (c) limitations of ML methods, and (d) techniques for overcoming these limitations to deliver workable solutions for analog layout automation.
"Machine Learning Techniques in Analog Layout Automation". Tonmoy Dhar, K. Kunal, Yaguang Li, Yishuang Lin, Meghna Madhusudan, Jitesh Poojary, A. Sharma, S. Burns, R. Harjani, Jiang Hu, P. Mukherjee, Soner Yaldiz, S. Sapatnekar. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3446896
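The approximate structure matching mentioned in the abstract above can be illustrated with a Weisfeiler-Lehman-style graph signature: a cheap necessary condition for two subcircuits to be isomorphic, useful for ruling out candidate symmetric pairs. This is only a sketch of the general idea, not ALIGN's neural-network matcher or its graph-edit-distance formulation:

```python
from collections import Counter

def wl_signature(adj, labels, rounds=3):
    """Weisfeiler-Lehman-style signature of a labeled graph.

    adj: {node: [neighbor nodes]}, labels: {node: device type}.
    Equal signatures are necessary (not sufficient) for isomorphism,
    so unequal signatures rule a candidate pair out cheaply.
    """
    cur = dict(labels)
    for _ in range(rounds):
        # Refine each node's label by its neighbors' current labels.
        cur = {v: hash((cur[v], tuple(sorted(cur[u] for u in adj[v]))))
               for v in adj}
    return Counter(cur.values())

def maybe_symmetric(adj_a, lab_a, adj_b, lab_b):
    """True if the two device subgraphs could plausibly be matched."""
    return wl_signature(adj_a, lab_a) == wl_signature(adj_b, lab_b)
```

In a real flow, pairs that survive this filter would go to a more expensive matcher (exact isomorphism or a learned edit-distance estimate) before symmetry constraints are emitted.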
The human brain - which we consider the prototypical biological computer - is, in its current incarnation, the result of more than a billion years of evolution. Its main functions have always been to regulate the internal milieu and to help the organism survive and reproduce. With growing complexity, the brain has adopted a number of design principles that maximize its efficiency across a broad range of tasks. The physical computer, on the other hand, has had only 200 years or so to evolve, and its perceived function was considerably different and far more constrained - to solve a set of mathematical functions. This, however, is rapidly changing. One may argue that the functions of brains and computers are converging. If so, the question arises whether the underlying design principles will converge or cross-breed as well, or whether the different underlying mechanisms (physics versus biology) will lead to radically different solutions.
"Of Brains and Computers". J. Rabaey. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3446899
In Placement Legalization, it is often assumed that (almost) all standard cells have the same height and can therefore be aligned in cell rows, which can then be treated independently. However, this is no longer true for recent technologies, where a substantial number of cells of double- or even arbitrary multiple-row height is to be expected. Due to interdependencies between the cell placements within several rows, the legalization task becomes considerably harder. In this paper, we show how to optimize quadratic cell movement for pairs of adjacent rows comprising cells of single- as well as double-row height with a fixed left-to-right ordering in time $\mathcal{O}(n \cdot \log(n))$, where n denotes the number of cells involved. In contrast to prior works, we do not artificially bound the maximum cell movement and can guarantee an optimum solution. Experimental results show an average decrease of over 26% in total quadratic movement compared to a legalization approach that fixes cells of more than single-row height after Global Placement.
"A Fast Optimal Double Row Legalization Algorithm". S. Hougardy, Meike Neuwohner, Ulrike Schorr. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3447044
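For intuition, the single-row special case of quadratic-movement legalization with a fixed left-to-right ordering is solved exactly by the classic clumping ("Abacus"-style) algorithm. The sketch below illustrates that special case only; it is not the paper's double-row method:

```python
def legalize_row(targets, widths):
    """Given desired left edges `targets` and cell `widths` in a fixed
    left-to-right order, return non-overlapping left edges minimizing
    total quadratic displacement (classic clumping algorithm)."""
    # Each cluster is [q, n, w, i0]: q = sum over its cells of
    # (target - offset within cluster), n = cell count, w = total width,
    # i0 = index of its first cell. Optimal cluster position is q / n.
    clusters = []
    for i, (t, w) in enumerate(zip(targets, widths)):
        clusters.append([t, 1, w, i])
        # Merge backwards while adjacent clusters would overlap.
        while len(clusters) > 1 and \
                clusters[-2][0] / clusters[-2][1] + clusters[-2][2] > \
                clusters[-1][0] / clusters[-1][1]:
            q2, n2, w2, _ = clusters.pop()
            q1, n1, w1, i1 = clusters[-1]
            # Cells of the right cluster shift their offsets by w1.
            clusters[-1] = [q1 + q2 - n2 * w1, n1 + n2, w1 + w2, i1]
    xs = [0.0] * len(targets)
    for q, n, w, i0 in clusters:
        x = q / n
        for j in range(i0, i0 + n):
            xs[j] = x
            x += widths[j]
    return xs
```

The paper's contribution lies in extending this kind of exact optimization to pairs of adjacent rows containing double-row-height cells, where the rows can no longer be treated independently.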
As with most aspects of electronic systems and integrated circuits, hardware security has traditionally evolved around the dominant CMOS technology. However, with the rise of various emerging technologies, whose main purpose is to overcome the fundamental scaling and power-consumption limitations of CMOS technology, unique opportunities arise to advance the notion of hardware security. In this paper, I first provide an overview of hardware security in general. Next, I review selected emerging technologies, namely (i) spintronics, (ii) memristors, (iii) carbon nanotubes and related transistors, (iv) nanowires and related transistors, and (v) 3D and 2.5D integration. I then discuss their application to advancing hardware security and outline related challenges.
"Hardware Security for and beyond CMOS Technology". J. Knechtel. In Proceedings of the 2021 International Symposium on Physical Design. https://doi.org/10.1145/3439706.3446902