A new SAT-based algorithm for symbolic model checking has been gaining popularity. This algorithm, referred to as “Incremental Construction of Inductive Clauses for Indubitable Correctness” (IC3) or “Property Directed Reachability” (PDR), uses information learned from SAT instances of isolated time frames to either prove that an invariant exists or provide a counterexample. The information learned between each time frame is recorded in the form of cubes of the state variables. In this work, we study the effect of extending PDR to use cubes of intermediate variables representing the logic gates in the transition relation. We demonstrate that we can improve the runtime for satisfiable benchmarks by up to 3.2X, with an average speedup of 1.23X. Our approach also provides a speedup of up to 3.84X for unsatisfiable benchmarks.
{"title":"Using cubes of non-state variables with Property Directed Reachability","authors":"John D. Backes, Marc D. Riedel","doi":"10.7873/DATE.2013.171","DOIUrl":"https://doi.org/10.7873/DATE.2013.171","url":null,"abstract":"A new SAT-Based algorithm for symbolic model checking has been gaining popularity. This algorithm, referred to as “Incremental Construction of Inductive Clauses for Indubitable Correctness” (IC3) or “Property Directed Reachability” (PDR), uses information learned from SAT instances of isolated time frames to either prove that an invariant exists, or provide a counter example. The information learned between each time frame is recorded in the form of cubes of the state variables. In this work, we study the effect of extending PDR to use cubes of intermediate variables representing the logic gates in the transition relation. We demonstrate that we can improve the runtime for satisfiable benchmarks by up to 3.2X, with an average speedup of 1.23X. Our approach also provides a speedup of up to 3.84X for unsatisfiable benchmarks.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"19 1","pages":"807-810"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83572360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The growing variability and complexity of advanced CMOS technologies make the physical design of clocked logic in large Systems-on-Chip more and more challenging. Asynchronous logic has been studied for many years and has become an attractive solution for a broad range of applications, from massively parallel multi-media systems to systems with ultra-low power & low-noise constraints, like cryptography, energy autonomous systems, and sensor-network nodes. The objective of this embedded tutorial is to give a comprehensive and recent overview of asynchronous logic. The tutorial will cover the basic principles and advantages of asynchronous logic, some insights on new research challenges, and will present the GALS scheme as an intermediate design style with recent results in asynchronous Network-on-Chip for future Many Core architectures. Regarding industrial acceptance, recent asynchronous logic applications within the microelectronics industry will be presented, with a main focus on the commercial CAD tools available today.
{"title":"Advances in asynchronous logic: From principles to GALS & NoC, recent industry applications, and commercial CAD tools","authors":"A. Yakovlev, P. Vivet, M. Renaudin","doi":"10.7873/DATE.2013.346","DOIUrl":"https://doi.org/10.7873/DATE.2013.346","url":null,"abstract":"The growing variability and complexity of advanced CMOS technologies makes the physical design of clocked logic in large Systems-on-Chip more and more challenging. Asynchronous logic has been studied for many years and become an attractive solution for a broad range of applications, from massively parallel multi-media systems to systems with ultra-low power & low-noise constraints, like cryptography, energy autonomous systems, and sensor-network nodes. The objective of this embedded tutorial is to give a comprehensive and recent overview of asynchronous logic. The tutorial will cover the basic principles and advantages of asynchronous logic, some insights on new research challenges, and will present the GALS scheme as an intermediate design style with recent results in asynchronous Network-on-Chip for future Many Core architectures. Regarding industrial acceptance, recent asynchronous logic applications within the microelectronics industry will be presented, with a main focus on the commercial CAD tools available today.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"10 1","pages":"1715-1724"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90179839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The high heat flux and compact structure of three-dimensional circuits (3D ICs) make conventional air-cooled devices more susceptible to overheating. Liquid cooling is an alternative that can improve heat dissipation and reduce thermal issues. Fast and accurate thermal models are needed to appropriately dimension the cooling system at design time. Several models have been proposed to study different designs, but generally with low simulation performance. In this paper, we present an efficient model of the transient thermal behaviour of liquid-cooled 3D ICs. In our experiments, our approach is 60 times faster and uses 600 times less memory than state-of-the-art models, while maintaining the same level of accuracy.
{"title":"Explicit transient thermal simulation of liquid-cooled 3D ICs","authors":"Alain Fourmigue, G. Beltrame, G. Nicolescu","doi":"10.7873/DATE.2013.283","DOIUrl":"https://doi.org/10.7873/DATE.2013.283","url":null,"abstract":"The high heat flux and compact structure of three-dimensional circuits (3D ICs) make conventional air-cooled devices more subsceptible to overheating. Liquid cooling is an alternative that can improve heat dissipation, and reduce thermal issues. Fast and accurate thermal models are needed to appropriately dimension the cooling system at design time. Several models have been proposed to study different designs, but generally with low simulation performance. In this paper, we present an efficient model of the transient thermal behaviour of liquid-cooled 3D ICs. In our experiments, our approach is 60 times faster and uses 600 times less memory than state-of-the-art models, while maintaining the same level of accuracy.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"8 1 1","pages":"1385-1390"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90364363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
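The abstract above does not describe the model itself, so the following is only a generic sketch of what explicit time stepping means for a transient thermal problem. It is not the authors' method. It applies forward-Euler updates to the 1D heat equation on a uniform grid with fixed boundary temperatures, and it ignores liquid cooling and the 3D stack entirely. The function names and all numeric values are hypothetical.

```python
def explicit_step(temps, alpha, dx, dt):
    """One forward-Euler step of dT/dt = alpha * d2T/dx2 on a uniform 1D grid.

    The two boundary cells are held at their current temperature.
    """
    r = alpha * dt / dx ** 2
    if r > 0.5:
        # Classic stability limit of this explicit scheme in one dimension.
        raise ValueError("time step too large for the explicit scheme (need r <= 0.5)")
    new = list(temps)
    for i in range(1, len(temps) - 1):
        new[i] = temps[i] + r * (temps[i - 1] - 2 * temps[i] + temps[i + 1])
    return new


def simulate(temps, alpha, dx, dt, steps):
    """Advance the temperature profile by `steps` explicit updates."""
    for _ in range(steps):
        temps = explicit_step(temps, alpha, dx, dt)
    return temps


# Hypothetical hot spot in the middle of a rod whose ends stay at 300 K.
profile = explicit_step([300.0, 300.0, 350.0, 300.0, 300.0], alpha=1.0, dx=1.0, dt=0.25)
```

Each update needs only the neighbouring cells from the previous step and no linear system solve, which keeps the cost per step low at the price of a bounded time step.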
Defects in TSVs due to fabrication steps decrease the yield and reliability of 3D stacked ICs, hence these defects need to be screened early in the manufacturing flow. Before wafer thinning, TSVs are buried in silicon and cannot be mechanically contacted, which severely limits test access. Although TSVs become exposed after wafer thinning, probing on them is difficult because of TSV dimensions and the risk of probe-induced damage. To circumvent these problems, we propose a non-invasive method for pre-bond TSV test that does not require TSV probing. We use open TSVs as capacitive loads of their driving gates and measure the propagation delay by means of ring oscillators. Defects in TSVs cause variations in their RC parameters and therefore lead to variations in the propagation delay. By measuring these variations, we can detect resistive open and leakage faults. We exploit different voltage levels to increase the sensitivity of the test and its robustness against random process variations. Results on fault detection effectiveness are presented through HSPICE simulations using realistic models for 45nm CMOS technology. The estimated DfT area cost of our method is negligible for realistic dies.
{"title":"Non-invasive pre-bond TSV test using ring oscillators and multiple voltage levels","authors":"Sergej Deutsch, K. Chakrabarty","doi":"10.7873/DATE.2013.225","DOIUrl":"https://doi.org/10.7873/DATE.2013.225","url":null,"abstract":"Defects in TSVs due to fabrication steps decrease the yield and reliability of 3D stacked ICs, hence these defects need to be screened early in the manufacturing flow. Before wafer thinning, TSVs are buried in silicon and cannot be mechanically contacted, which severely limits test access. Although TSVs become exposed after wafer thinning, probing on them is difficult because of TSV dimensions and the risk of probe-induced damage. To circumvent these problems, we propose a non-invasive method for pre-bond TSV test that does not require TSV probing. We use open TSVs as capacitive loads of their driving gates and measure the propagation delay by means of ring oscillators. Defects in TSVs cause variations in their RC parameters and therefore lead to variations in the propagation delay. By measuring these variations, we can detect resistive open and leakage faults. We exploit different voltage levels to increase the sensitivity of the test and its robustness against random process variations. Results on fault detection effectiveness are presented through HSPICE simulations using realistic models for 45nm CMOS technology. The estimated DfT area cost of our method is negligible for realistic dies.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"32 1","pages":"1065-1070"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90374731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
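The idea in the abstract above, that a TSV acting as a capacitive load shifts a ring oscillator's period, can be sketched with a first-order RC model. This is a rough illustration under stated assumptions, not the paper's HSPICE setup. Each stage delay is taken as ln(2)·R·C, one stage additionally drives the TSV capacitance, and a defect is represented simply as a changed effective TSV capacitance. The component values, the 5% tolerance, and the function names are all hypothetical.

```python
import math


def stage_delay(r_drv, c_load):
    """50% propagation delay of a first-order RC stage: ln(2) * R * C."""
    return math.log(2) * r_drv * c_load


def ring_period(n_stages, r_drv, c_gate, c_tsv):
    """Period of an odd-length inverter ring in which one stage also drives an open TSV."""
    if n_stages % 2 == 0:
        raise ValueError("a simple inverter ring needs an odd number of stages")
    plain = (n_stages - 1) * stage_delay(r_drv, c_gate)
    loaded = stage_delay(r_drv, c_gate + c_tsv)
    return 2 * (plain + loaded)  # a full period is two traversals of the ring


def is_suspect(measured, reference, tolerance):
    """Flag a TSV whose measured period deviates from the fault-free reference by more than `tolerance`."""
    return abs(measured - reference) > tolerance * reference


# Hypothetical values: 1 kOhm driver, 1 fF gate load, 50 fF fault-free TSV.
reference = ring_period(5, 1e3, 1e-15, 50e-15)
# Assumed defect model: only half of the TSV capacitance is seen by the driver.
faulty = ring_period(5, 1e3, 1e-15, 25e-15)
```

In this toy model, `is_suspect(faulty, reference, 0.05)` flags the defective TSV because the period drops well below the reference band. Choosing the band so that process variation is not mistaken for a defect is the part the abstract addresses with multiple voltage levels, which this sketch does not model.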
Optical networks-on-chip (ONoCs) are currently still in the concept stage, and would benefit from explorative studies capable of bridging the gap between abstract analysis frameworks and the constraints and challenges posed by the physical layer. This paper aims to go beyond the traditional comparison of wavelength-routed ONoC topologies based only on their abstract properties, and for the first time assesses their physical implementation efficiency in a homogeneous experimental setting of practical relevance. As a result, the paper demonstrates the significant and differing deviation of topology layouts from their logic schemes under the effect of placement constraints on the target system. This then becomes the preliminary step toward accurately characterizing technology-specific metrics, such as the insertion-loss critical path, and deriving the ultimate impact on the power efficiency and feasibility of each design.
{"title":"Contrasting wavelength-routed optical NoC topologies for power-efficient 3d-stacked multicore processors using physical-layer analysis","authors":"L. Ramini, P. Grani, S. Bartolini, D. Bertozzi","doi":"10.7873/DATE.2013.323","DOIUrl":"https://doi.org/10.7873/DATE.2013.323","url":null,"abstract":"Optical networks-on-chip (ONoCs) are currently still in the concept stage, and would benefit from explorative studies capable of bridging the gap between abstract analysis frameworks and the constraints and challenges posed by the physical layer. This paper aims to go beyond the traditional comparison of wavelength-routed ONoC topologies based only on their abstract properties, and for the first time assesses their physical implementation efficiency in an homogeneous experimental setting of practical relevance. As a result, the paper can demonstrate the significant and different deviation of topology layouts from their logic schemes under the effect of placement constraints on the target system. This becomes then the preliminary step for the accurate characterization of technology-specific metrics such as the insertion loss critical path, and to derive the ultimate impact on power efficiency and feasibility of each design.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"22 1","pages":"1589-1594"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90918246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Kleef, T. L. Massey, P. Ledochowitsch, R. Muller, R. Tiefenauer, T. Blanche, Hirotaka Sato, M. Maharbiz
In the original demonstration of insect flight control [2,4], flight initiation, cessation and elevation control were accomplished through neural stimulus of the brain which elicited, suppressed or modulated wing oscillation. Turns were triggered through the direct muscular stimulus of either of the basalar muscles. We characterized the response times, success rates, and free-flight trajectories elicited by our neural control systems in remotely controlled beetles.
{"title":"Cyborg insects, neural interfaces and other things: Building interfaces between the synthetic and the multicellular","authors":"J. Kleef, T. L. Massey, P. Ledochowitsch, R. Muller, R. Tiefenauer, T. Blanche, Hirotaka Sato, M. Maharbiz","doi":"10.7873/DATE.2013.314","DOIUrl":"https://doi.org/10.7873/DATE.2013.314","url":null,"abstract":"In the original demonstration of insect flight control [2,4], flight initiation, cessation and elevation control were accomplished through neural stimulus of the brain which elicited, suppressed or modulated wing oscillation. Turns were triggered through the direct muscular stimulus of either of the basalar muscles. We characterized the response times, success rates, and free-flight trajectories elicited by our neural control systems in remotely controlled beetles.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"22 1","pages":"1546-1546"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85294851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents an efficient algorithm for the placement of power supply pads in flip-chip packaging for high-performance VLSI circuits. The placement problem is formulated as a mixed-integer linear program (MILP), subject to constraints on the mean-time-to-failure (MTTF) of the pads and on the voltage drop in the power grid. To improve the performance of the optimizer, the pad placement problem is solved based on the divide-and-conquer principle, and the locality properties of the power grid are exploited by modeling the distant nodes and sources coarsely, following the coarsening stage of a multigrid-like approach. An accurate electromigration (EM) model that captures current crowding and Joule heating effects is developed and integrated with our C4 placement approach. The effectiveness of the proposed approach is demonstrated on several designs adapted from publicly released benchmarks.
{"title":"Placement optimization of power supply pads based on locality","authors":"Pingqiang Zhou, Vivek Mishra, S. Sapatnekar","doi":"10.5555/2485288.2485681","DOIUrl":"https://doi.org/10.5555/2485288.2485681","url":null,"abstract":"This paper presents an efficient algorithm for the placement of power supply pads in flip-chip packaging for high-performance VLSI circuits. The placement problem is formulated as a mixed-integer linear program (MILP), subject to the constraints on mean-time-to-failure (MTTF) for the pads and the voltage drop in the power grid. To improve the performance of the optimizer, the pad placement problem is solved based on the divide-and-conquer principle, and the locality properties of the power grid are exploited by modeling the distant nodes and sources coarsely, following the coarsening stage in multi-grid-like approach. An accurate electromigration (EM) model that captures current crowding and Joule heating effects is developed and integrated with our C4 placement approach. The effectiveness of the proposed approach is demonstrated on several designs adapted from publicly released benchmarks.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"27 1","pages":"1655-1660"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78050457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
During various stages of hardware design, different types of control signals are introduced: clock and reset are specified and connected at the RTL stage, whereas signals such as scan enable, isolation enable, and power switch enable are added to implemented devices later in the flow.
{"title":"Automated determination of Top Level Control Signals","authors":"R. Jain, Praveen Tiwari, Soumen Ghosh","doi":"10.7873/DATE.2013.115","DOIUrl":"https://doi.org/10.7873/DATE.2013.115","url":null,"abstract":"During various stages of hardware design, different types of control signals get introduced; clock, reset are specified and connected at the RTL stage whereas signals like scan enable, isolation enable, power switch enable get added to implemented devices later in the flow.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"92 1","pages":"509-512"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80929235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a new, low hardware overhead solution for permanent fault detection at the micro-architecture/instruction level. The proposed technique is based on an ultra-reduced instruction set co-processor (URISC) that, in its simplest form, executes only one Turing-complete instruction, the subleq instruction. Thus, any instruction on the main core can be redundantly executed on the URISC using a sequence of subleq instructions, and the results can be compared, also on the URISC, to detect faults. A number of novel software and hardware techniques are proposed to decrease the performance overhead of online fault detection while keeping the error detection latency bounded, including: (i) URISC routines and hardware support to check both control and data flow instructions; (ii) checking only a subset of instructions in the code based on a novel check window criterion; and (iii) URISC instruction set extensions. Our experimental results, based on FPGA synthesis and RTL simulations, illustrate the benefits of the proposed techniques.
{"title":"Low cost permanent fault detection using ultra-reduced instruction set co-processors","authors":"S. Ananthanarayanan, S. Garg, Hiren D. Patel","doi":"10.7873/DATE.2013.196","DOIUrl":"https://doi.org/10.7873/DATE.2013.196","url":null,"abstract":"In this paper, we propose a new, low hardware overhead solution for permanent fault detection at the micro-architecture/instruction level. The proposed technique is based on an ultra-reduced instruction set co-processor (URISC) that, in its simplest form, executes only one Turing complete instruction — the subleq instruction. Thus, any instruction on the main core can be redundantly executed on the URISC using a sequence of subleq instructions, and the results can be compared, also on the URISC, to detect faults. A number of novel software and hardware techniques are proposed to decrease the performance overhead of online fault detection while keeping the error detection latency bounded including: (i) URISC routines and hardware support to check both control and data flow instructions; (ii) checking only a subset of instructions in the code based on a novel check window criterion; and (iii) URISC instruction set extensions. Our experimental results, based on FPGA synthesis and RTL simulations, illustrate the benefits of the proposed techniques.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"32 1","pages":"933-938"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76084141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
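The redundant execution described in the abstract above can be illustrated with a minimal subleq interpreter. This is a sketch under assumptions, not the paper's URISC design. The memory layout, the convention that a negative jump target halts, and the function names are hypothetical, and the final comparison is done in host Python rather than on the co-processor. A subleq instruction `a b c` computes `mem[b] -= mem[a]` and jumps to `c` if the result is less than or equal to zero.

```python
def run_subleq(mem, pc=0, max_steps=10_000):
    """Execute subleq triples (a, b, c): mem[b] -= mem[a]; jump to c if the result <= 0.

    A negative program counter halts the machine (a convention assumed here).
    """
    mem = list(mem)
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


def add_via_subleq(x, y):
    """Compute x + y with three subleq instructions, standing in for a main-core ADD."""
    X, Y, Z = 9, 10, 11  # data words placed right after the 9 instruction words
    program = [
        X, Z, 3,    # Z = 0 - x; both outcomes continue at address 3
        Z, Y, 6,    # Y = y - (-x) = x + y; both outcomes continue at address 6
        Z, Z, -1,   # Z = 0, which is <= 0, so jump to -1 and halt
        x, y, 0,    # data: X, Y, Z
    ]
    return run_subleq(program)[Y]


def check_add(x, y, main_core_result):
    """Return True if the main core's ADD result matches the subleq recomputation."""
    return add_via_subleq(x, y) == main_core_result
```

For example, `check_add(2, 2, 5)` returns `False`, which in this toy setting would signal a fault in the unit that produced 5.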
Zhen Li, S. L. Beux, C. Monat, X. Letartre, I. O’Connor
The computation capacity of conventional FPGAs is directly proportional to the size and expressive power of Look Up Table (LUT) resources. Individual LUT performance is limited by transistor switching time and power dissipation, defined by the CMOS fabrication process. In this paper we propose OLUT, an optical core implementation of LUT, which has the potential for low latency and low power computation. In addition, the use of Wavelength Division Multiplexing (WDM) allows parallel computation, which can further increase computation capacity. Preliminary experimental results demonstrate the potential for optically assisted on-chip computation.
{"title":"Optical Look Up Table","authors":"Zhen Li, S. L. Beux, C. Monat, X. Letartre, I. O’Connor","doi":"10.7873/DATE.2013.184","DOIUrl":"https://doi.org/10.7873/DATE.2013.184","url":null,"abstract":"The computation capacity of conventional FPGAs is directly proportional to the size and expressive power of Look Up Table (LUT) resources. Individual LUT performance is limited by transistor switching time and power dissipation, defined by the CMOS fabrication process. In this paper we propose OLUT, an optical core implementation of LUT, which has the potential for low latency and low power computation. In addition, the use of Wavelength Division Multiplexing (WDM) allows parallel computation, which can further increase computation capacity. Preliminary experimental results demonstrate the potential for optically assisted on-chip computation.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"11 1","pages":"873-876"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76227922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
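As a purely logical illustration of the look-up tables discussed in the abstract above, a k-input LUT can be modelled as 2^k stored bits addressed by the input vector. The sketch says nothing about the optical implementation. Evaluating several input vectors against one table is only a loose software analogy for serving one vector per WDM wavelength, and that analogy, along with all names below, is an assumption made for illustration.

```python
def make_lut(func, k):
    """Precompute the 2**k configuration bits of a k-input LUT from a Boolean function."""
    table = []
    for addr in range(2 ** k):
        bits = [(addr >> i) & 1 for i in range(k)]  # bits[0] is the least significant input
        table.append(int(bool(func(*bits))))
    return table


def lut_eval(table, inputs):
    """Look up the output bit for one input vector (inputs[0] is the least significant bit)."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return table[addr]


def lut_eval_parallel(table, channels):
    """Evaluate the same table on several independent input vectors, one per channel."""
    return [lut_eval(table, inputs) for inputs in channels]


# A 3-input XOR configured into an 8-entry table.
xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
```

Here `lut_eval_parallel(xor3, [[1, 0, 1], [1, 1, 1]])` yields one output bit per channel from the same stored configuration.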