This paper discusses the use of networks-on-chip (NoCs) consisting of multiple voltage-frequency islands to cope with power consumption, clock distribution and parameter variation problems in future multiprocessor systems-on-chip (MPSoCs). In this architecture, communication within each island is synchronous, while communication across different islands is achieved via mixed-clock mixed-voltage queues. In order to dynamically control the speed of each domain in the presence of parameter and workload variations, we propose a robust feedback control methodology. Towards this end, we first develop a state-space model based on the utilization of the inter-domain queues. Then, we identify the theoretical conditions under which the network is controllable. Finally, we synthesize state feedback controllers to cope with workload variations and minimize power consumption. Experimental results demonstrate robustness to parameter variations and more than 40% energy savings by exploiting workload variations through dynamic voltage-frequency scaling (DVFS) for a hardware MPEG-2 encoder design.
{"title":"Variation-adaptive feedback control for networks-on-chip with multiple clock domains","authors":"Ümit Y. Ogras, R. Marculescu, Diana Marculescu","doi":"10.1145/1391469.1391627","DOIUrl":"https://doi.org/10.1145/1391469.1391627","url":null,"abstract":"This paper discusses the use of networks-on-chip (NoCs) consisting of multiple voltage-frequency islands to cope with power consumption, clock distribution and parameter variation problems in future multiprocessor systems-on-chip (MPSoCs). In this architecture, communication within each island is synchronous, while communication across different islands is achieved via mixed-clock mixed-voltage queues. In order to dynamically control the speed of each domain in the presence of parameter and workload variations, we propose a robust feedback control methodology. Towards this end, we first develop a state-space model based on the utilization of the inter-domain queues. Then, we identify the theoretical conditions under which the network is controllable. Finally, we synthesize state feedback controllers to cope with workload variations and minimize power consumption. Experimental results demonstrate robustness to parameter variations and more than 40% energy savings by exploiting workload variations through dynamic voltage-frequency scaling (DVFS) for a hardware MPEG-2 encoder design.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133265088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
During the development of a next-generation Intel processor, the project's formal verification team verified a new coherence protocol and portions of its RTL implementation against the protocol's specification within project deadlines. Typically, FV teams apply formal property verification (FPV) after RTL is coded and, though it continues to be an effective complement to pre-silicon validation, this late application prevents it from keeping pace with the continual complexity increases in hardware designs. Our discussion centers around how applying FV early in the development cycle of this processor enabled continual verification as the design progressed, culminating with the targeted RTL verification. We also present the languages and methodologies used, the reasons behind the choices, and where improvements can be made.
{"title":"Pre-RTL formal verification: An Intel experience","authors":"R. Beers","doi":"10.1145/1391469.1391675","DOIUrl":"https://doi.org/10.1145/1391469.1391675","url":null,"abstract":"During the development of a next-generation Intel processor, the project's formal verification team verified a new coherence protocol and portions of its RTL implementation against the protocol's specification within project deadlines. Typically, FV teams apply formal property verification (FPV) after RTL is coded and, though it continues to be an effective complement to pre-silicon validation, this late application prevents it from keeping pace with the continual complexity increases in hardware designs. Our discussion centers around how applying FV early in the development cycle of this processor enabled continual verification as the design progressed, culminating with the targeted RTL verification. We also present the languages and methodologies used, the reasons behind the choices, and where improvements can be made.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130556839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Empirically characterized equation- and table-based cell models have been applied in static timing analysis for decades. These models have been extended to handle a variety of environmental and circuit phenomena over the years. This has given rise to a profusion of cell models that are used to verify circuit functionality and performance. The recent invention of a second-generation of current source models shows the promise of a unified electrical cell model that comprehensively addresses most of the effects that are perceived as accuracy limiters. In this paper, we describe these accuracy limiters and present comprehensive results for a particular current source model [11].
{"title":"A “true” electrical cell model for timing, noise, and power grid verification","authors":"N. Menezes, Chandramouli V. Kashyap, C. Amin","doi":"10.1145/1391469.1391589","DOIUrl":"https://doi.org/10.1145/1391469.1391589","url":null,"abstract":"Empirically characterized equation- and table-based cell models have been applied in static timing analysis for decades. These models have been extended to handle a variety of environmental and circuit phenomena over the years. This has given rise to a profusion of cell models that are used to verify circuit functionality and performance. The recent invention of a second-generation of current source models shows the promise of a unified electrical cell model that comprehensively addresses most of the effects that are perceived as accuracy limiters. In this paper, we describe these accuracy limiters and present comprehensive results for a particular current source model [11].","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"218 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123394916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Sapatnekar, Eshel Haritan, K. Keutzer, A. Devgan, D. Kirkpatrick, S. Meier, Duaine Pryor, Tom Spyrou
Faced with continually coping with Moore's Law, computer-aided design (CAD) for integrated circuits is used to facing challenges in our ever-evolving design problem. Increasing device complexity is a perennial challenge and has led to several discontinuities in design methodology. Over the last decade deep submicron physical effects have significantly complicated the design process and required new efforts in design for manufacturability. With the emergence of multicore and manycore microprocessor systems we face a new type of challenge: Not only will our design object (the microprocessor systems themselves) take another leap in complexity, but for the first time in our industry's history we will need to fundamentally change the way we design and implement our software solutions as well, this panel a broad set of representatives at the front lines of addressing this challenge will outline how they plan to respond.
{"title":"Reinventing EDA with manycore processors","authors":"S. Sapatnekar, Eshel Haritan, K. Keutzer, A. Devgan, D. Kirkpatrick, S. Meier, Duaine Pryor, Tom Spyrou","doi":"10.1145/1391469.1391502","DOIUrl":"https://doi.org/10.1145/1391469.1391502","url":null,"abstract":"Faced with continually coping with Moore's Law, computer-aided design (CAD) for integrated circuits is used to facing challenges in our ever-evolving design problem. Increasing device complexity is a perennial challenge and has led to several discontinuities in design methodology. Over the last decade deep submicron physical effects have significantly complicated the design process and required new efforts in design for manufacturability. With the emergence of multicore and manycore microprocessor systems we face a new type of challenge: Not only will our design object (the microprocessor systems themselves) take another leap in complexity, but for the first time in our industry's history we will need to fundamentally change the way we design and implement our software solutions as well, this panel a broad set of representatives at the front lines of addressing this challenge will outline how they plan to respond.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128808733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents an automated analog synthesis tool for topology generation and subsequent circuit sizing. Though sizing is indispensable, the paper mainly concentrates on topology generation. A new kind of GA is developed, where a fraction of the offsprings in each generation is built from building blocks or cells obtained from previous generations. The cells are stored in a hierarchically arranged library that also contains information on the preferred neighborhood of each cell. The adaptively formed cell library starts only with basic elements and gradually includes functionally useful and bigger blocks, pertinent to the design. The techniques have been applied to synthesize an operational amplifier and a ring oscillator design. Results show that with reasonable computational effort, topologies have evolved that are designer understandable.
{"title":"Topology synthesis of analog circuits based on adaptively generated building blocks","authors":"Angan Das, R. Vemuri","doi":"10.1145/1391469.1391483","DOIUrl":"https://doi.org/10.1145/1391469.1391483","url":null,"abstract":"This paper presents an automated analog synthesis tool for topology generation and subsequent circuit sizing. Though sizing is indispensable, the paper mainly concentrates on topology generation. A new kind of GA is developed, where a fraction of the offsprings in each generation is built from building blocks or cells obtained from previous generations. The cells are stored in a hierarchically arranged library that also contains information on the preferred neighborhood of each cell. The adaptively formed cell library starts only with basic elements and gradually includes functionally useful and bigger blocks, pertinent to the design. The techniques have been applied to synthesize an operational amplifier and a ring oscillator design. Results show that with reasonable computational effort, topologies have evolved that are designer understandable.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121160737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrated circuits with bounded lifetimes can have many business advantages. We give some simple examples of methods to enforce tunable expiration dates for chips using nanometer reliability mechanisms.
{"title":"Bounded-lifetime integrated circuits","authors":"Puneet Gupta, A. Kahng","doi":"10.1145/1391469.1391560","DOIUrl":"https://doi.org/10.1145/1391469.1391560","url":null,"abstract":"Integrated circuits with bounded lifetimes can have many business advantages. We give some simple examples of methods to enforce tunable expiration dates for chips using nanometer reliability mechanisms.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121404313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent advances in digital microfluidics have enabled lab-on-a-chip devices for DNA sequencing, immunoassays, clinical chemistry, and protein crystallization. Basic operations such as droplet dispensing, mixing, dilution, localized heating, and incubation can be carried out using a two-dimensional array of electrodes and nanoliter volumes of liquid. The number of independent input pins used to control the electrodes in such microfluidic "biochips" is an important cost-driver, especially for disposable PCB devices that are being developed for clinical and point-of-care diagnostics. However, most prior work on biochip design-automation has assumed independent control of the electrodes using a large number of input pins. Another limitation of prior work is that the mapping of control pins to electrodes is only applicable for a specific bioassay. We present a broadcast-addressing-based design technique for pin-constrained multi-functional biochips. The proposed method provides high throughput for bioassays and it reduces the number of control pins by identifying and connecting control pins with "compatible" actuation sequences. The proposed method is evaluated using a multifunctional chip designed to execute a set of multiplexed bioassays, the polymerase chain reaction, and a protein dilution assay.
{"title":"Broadcast electrode-addressing for pin-constrained multi-functional digital microfluidic biochips","authors":"Tao Xu, K. Chakrabarty","doi":"10.1145/1391469.1391514","DOIUrl":"https://doi.org/10.1145/1391469.1391514","url":null,"abstract":"Recent advances in digital microfluidics have enabled lab-on-a-chip devices for DNA sequencing, immunoassays, clinical chemistry, and protein crystallization. Basic operations such as droplet dispensing, mixing, dilution, localized heating, and incubation can be carried out using a two-dimensional array of electrodes and nanoliter volumes of liquid. The number of independent input pins used to control the electrodes in such microfluidic \"biochips\" is an important cost-driver, especially for disposable PCB devices that are being developed for clinical and point-of-care diagnostics. However, most prior work on biochip design-automation has assumed independent control of the electrodes using a large number of input pins. Another limitation of prior work is that the mapping of control pins to electrodes is only applicable for a specific bioassay. We present a broadcast-addressing-based design technique for pin-constrained multi-functional biochips. The proposed method provides high throughput for bioassays and it reduces the number of control pins by identifying and connecting control pins with \"compatible\" actuation sequences. The proposed method is evaluated using a multifunctional chip designed to execute a set of multiplexed bioassays, the polymerase chain reaction, and a protein dilution assay.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121345658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Increasing variability in the manufacturing process and growing complexity of the integrated circuits has given rise to many design and verification challenges. Statistical analysis of circuits and current source based gate delay models have started to replace the conventional static timing analysis which uses lookup tables for gate delays. In this paper we develop a statistical current source based gate model. We use accurate analytical models for representing the parameters of the gate model as functions of process parameters. Using the proposed statistical gate model, the gate output signal is generated and modeled as process dependent variational waveform. We present a compact model for representation of the variational signal waveform. The proposed waveform model can accurately generate the signal waveform at any process corner for accurate timing analysis. We generated the prosed model for gates of a 90 nm industry library and validated with SPICE simulations. Our model for logic gates and variational waveforms showed very good correlation with SPICE. The maximum error across all validation experiments was close to 3%.
{"title":"Statistical waveform and current source based standard cell models for accurate timing analysis","authors":"A. Goel, S. Vrudhula","doi":"10.1145/1391469.1391526","DOIUrl":"https://doi.org/10.1145/1391469.1391526","url":null,"abstract":"Increasing variability in the manufacturing process and growing complexity of the integrated circuits has given rise to many design and verification challenges. Statistical analysis of circuits and current source based gate delay models have started to replace the conventional static timing analysis which uses lookup tables for gate delays. In this paper we develop a statistical current source based gate model. We use accurate analytical models for representing the parameters of the gate model as functions of process parameters. Using the proposed statistical gate model, the gate output signal is generated and modeled as process dependent variational waveform. We present a compact model for representation of the variational signal waveform. The proposed waveform model can accurately generate the signal waveform at any process corner for accurate timing analysis. We generated the prosed model for gates of a 90 nm industry library and validated with SPICE simulations. Our model for logic gates and variational waveforms showed very good correlation with SPICE. The maximum error across all validation experiments was close to 3%.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"194 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132508079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In demand of more computing power and less energy use, multiprocessor with power management facility emerges in embedded system design. Dynamic voltage scaling is such a facility that varies clock speed and supply voltage to save more energy. In this paper, we propose ETAHM to allocate tasks on a target multiprocessor system. In pursuit of global optimal solution, it mixes task scheduling, mapping and DVS utilization in one phase and couples ant colony optimization algorithm. Extensive experiments show ETAHM could save 22.71% more energy than CASPER (V. Kianzad et al., 2005), a state-of-the-art integrated framework that tackles the identical problem with genetic algorithm instead.
在嵌入式系统设计中,为了满足更高的计算能力和更低的能耗需求,具有电源管理功能的多处理器应运而生。动态电压缩放是这样一种设备,改变时钟速度和供电电压,以节省更多的能源。在本文中,我们提出ETAHM在目标多处理器系统上分配任务。为了追求全局最优解,该算法将任务调度、映射和分布式交换机利用混合在一个阶段,并结合蚁群优化算法。大量实验表明,ETAHM比CASPER (V. Kianzad et al., 2005)多节省22.71%的能源,CASPER是一种最先进的集成框架,用遗传算法来解决相同的问题。
{"title":"ETAHM: An energy-aware task allocation algorithm for heterogeneous multiprocessor","authors":"Po-Chun Chang, I-Wei Wu, J. Shann, C. Chung","doi":"10.1145/1391469.1391667","DOIUrl":"https://doi.org/10.1145/1391469.1391667","url":null,"abstract":"In demand of more computing power and less energy use, multiprocessor with power management facility emerges in embedded system design. Dynamic voltage scaling is such a facility that varies clock speed and supply voltage to save more energy. In this paper, we propose ETAHM to allocate tasks on a target multiprocessor system. In pursuit of global optimal solution, it mixes task scheduling, mapping and DVS utilization in one phase and couples ant colony optimization algorithm. Extensive experiments show ETAHM could save 22.71% more energy than CASPER (V. Kianzad et al., 2005), a state-of-the-art integrated framework that tackles the identical problem with genetic algorithm instead.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132607434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Explaining the mismatch between predicted timing behavior from modeling and simulation, and the observed timing behavior measured on silicon chips can be very challenging. Given a list of potential sources, the mismatch can be the aggregate result caused by some of them both individually and collectively, resulting in a very large search space. Furthermore, observed data are always corrupted by some unknown statistical random noises. To overcome both challenges, this paper proposes a statistical diagnosis framework that formulates the diagnosis problem as a regression learning problem. In this diagnosis framework, the objective is to rank a set of features corresponding to the list of potential sources of concern. The rank is based on measured silicon path delay data such that a feature inducing a larger unexpected timing deviation is ranked higher. Experimental results are presented to explain the learning method. Diagnosis effectiveness will be demonstrated through benchmark experiments and on an industrial design.
{"title":"Statistical diagnosis of unmodeled systematic timing effects","authors":"P. Bastani, N. Callegari, Li-C. Wang, M. Abadir","doi":"10.1145/1391469.1391566","DOIUrl":"https://doi.org/10.1145/1391469.1391566","url":null,"abstract":"Explaining the mismatch between predicted timing behavior from modeling and simulation, and the observed timing behavior measured on silicon chips can be very challenging. Given a list of potential sources, the mismatch can be the aggregate result caused by some of them both individually and collectively, resulting in a very large search space. Furthermore, observed data are always corrupted by some unknown statistical random noises. To overcome both challenges, this paper proposes a statistical diagnosis framework that formulates the diagnosis problem as a regression learning problem. In this diagnosis framework, the objective is to rank a set of features corresponding to the list of potential sources of concern. The rank is based on measured silicon path delay data such that a feature inducing a larger unexpected timing deviation is ranked higher. Experimental results are presented to explain the learning method. Diagnosis effectiveness will be demonstrated through benchmark experiments and on an industrial design.","PeriodicalId":412696,"journal":{"name":"2008 45th ACM/IEEE Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130791020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}