In this paper, a global clock network that incorporates standing waves and coupled oscillators to distribute a high-frequency clock signal with low skew and low jitter is described. The key design issues involved in generating standing waves on a chip are discussed, including minimizing wire loss within an available technology. A standing-wave oscillator, a distributed oscillator that sustains ideal standing waves on lossy wires, is introduced. A clock grid architecture composed of coupled standing-wave oscillators and differential, low-swing clock buffers is presented. The measured results for a prototyped standing-wave clock grid operating at 10 GHz and fabricated in a 0.18 µm 6M CMOS logic process are presented. A technique is proposed for on-chip skew measurements with subpicosecond precision.
{"title":"Design of a 10GHz clock distribution network using coupled standing-wave oscillators","authors":"F. O’Mahony, C. Yue, M. Horowitz, S. Wong","doi":"10.1145/775832.776005","DOIUrl":"https://doi.org/10.1145/775832.776005","url":null,"abstract":"In this paper, a global clock network that incorporates standing waves and coupled oscillators to distribute a high-frequency clock signal with low skew and low jitter is described. The key design issues involved in generating standing waves on a chip are discussed, including minimizing wire loss within an available technology. A standing-wave oscillator, a distributed oscillator that sustains ideal standing waves on lossy wires, is introduced. A clock grid architecture comprised of coupled, standing-wave oscillators and differential, low-swing clock buffers is presented. The measured results for a prototyped standing-wave clock grid operating at 10GHz and fabricated in a 0.18 µm 6M CMOS logic process are presented. A technique is proposed for on-chip skew measurements with subpicosecond precision.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126743031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
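As a rough illustration of why a standing wave suits clock distribution, the textbook superposition below may help. It assumes an ideal lossless line carrying forward and backward waves of equal amplitude; the symbols A (amplitude), \omega (angular frequency), \beta (phase constant), and x (position along the wire) are introduced here for illustration and do not come from the paper.

```latex
\begin{align}
v(x,t) &= A\cos(\omega t - \beta x) + A\cos(\omega t + \beta x) \\
       &= 2A\cos(\beta x)\,\cos(\omega t)
\end{align}
```

Under these assumptions the time dependence \cos(\omega t) is the same at every position, so all points between two nodes switch in phase and only the amplitude 2A\cos(\beta x) varies with position. On lossy wires the two waves no longer have equal amplitude, which is presumably the gap the standing-wave oscillator in the abstract is meant to close.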
In this paper, we present an efficient method for computing switching windows in the presence of delay noise. In static timing analysis, delay noise has traditionally been modeled using a simple switch-factor based noise model and the computation of switching windows is performed using an iterative algorithm, resulting in an overall run time of O(n²), where n is the number of gates in the circuit. It has also been shown that the iterations converge to different solutions, depending on the initial assumptions, making it unclear which solution is correct. In this paper, we show that the iterative nature of the problem is due to the switching-factor based noise model and the order in which events are evaluated. We utilize a delay noise model based on superposition and propose a new algorithm with a run time that is linear with the circuit size. Since the algorithm is non-iterative and does not operate with initial assumptions, it also eliminates the multiple solution problems. We tested the algorithm on a number of designs and show that it achieves significant speedup over the iterative approach.
{"title":"Non-iterative switching window computation for delay-noise","authors":"Bhavana Thudi, D. Blaauw","doi":"10.1145/775832.775934","DOIUrl":"https://doi.org/10.1145/775832.775934","url":null,"abstract":"In this paper, we present an efficient method for computing switching windows in the presence of delay noise. In static timing analysis, delay noise has traditionally been modeled using a simple switch-factor based noise model and the computation of switching windows is performed using an iterative algorithm, resulting in an overall run time of O(n²), where n is the number of gates in the circuit. It has also been shown that the iterations converge to different solutions, depending on the initial assumptions, making it unclear which solution is correct. In this paper, we show that the iterative nature of the problem is due to the switching-factor based noise model and the order in which events are evaluated. We utilize a delay noise model based on superposition and propose a new algorithm with a run time that is linear with the circuit size. Since the algorithm is non-iterative and does not operate with initial assumptions, it also eliminates the multiple solution problems. We tested the algorithm on a number of designs and show that it achieves significant speedup over the iterative approach.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121631294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we propose a method for generating reduced models for a class of nonlinear dynamical systems, based on the truncated balanced realization (TBR) algorithm and a recently developed trajectory piecewise-linear (TPWL) model order reduction approach. We also present a scheme which uses both Krylov-based and TBR-based projections. Computational results, obtained for examples of nonlinear circuits and a micro-electro-mechanical system (MEMS), indicate that the proposed reduction scheme generates nonlinear macromodels with superior accuracy as compared to reduction algorithms based solely on Krylov subspace projections, while maintaining a relatively low model extraction cost.
{"title":"A TBR-based trajectory piecewise-linear algorithm for generating accurate low-order models for nonlinear analog circuits and MEMS","authors":"D. Vasilyev, M. Rewienski, Jacob K. White","doi":"10.1145/775832.775958","DOIUrl":"https://doi.org/10.1145/775832.775958","url":null,"abstract":"In this paper we propose a method for generating reduced models for a class of nonlinear dynamical systems, based on truncated balanced realization (TBR) algorithm and a recently developed trajectory piecewise-linear (TPWL) model order reduction approach. We also present a scheme which uses both Krylov-based and TBR-based projections. Computational results, obtained for examples of nonlinear circuits and a micro-electro-mechanical system (MEMS), indicate that the proposed reduction scheme generates nonlinear macromodels with superior accuracy as compared to reduction algorithms based solely on Krylov subspace projections, while maintaining a relatively low model extraction cost.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114833826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, a new technique for testing the interconnects of an arbitrary design mapped into an FPGA is presented. In this technique, only the configuration of logic blocks used in the design is changed. The test vector and configuration generation problem is systematically converted to a satisfiability (SAT) problem, and state-of-the-art SAT solvers are exploited for test configuration generation. Experimental results on various benchmark circuits show that only two test configurations are required to test for all bridging faults, achieving 100% fault coverage with respect to the fault list.
{"title":"Using satisfiability in application-dependent testing of FPGA interconnects","authors":"M. Tahoori","doi":"10.1145/775832.776003","DOIUrl":"https://doi.org/10.1145/775832.776003","url":null,"abstract":"In this paper, a new technique for testing the interconnects of an arbitrary design mapped into an FPGA is presented. In this technique, only the configuration of logic blocks used in the design is changed. The test vector and configuration generation problem is systematically converted to a satisfiability (SAT) problem, and state of the art SAT-solvers are exploited for test configuration generation. Experimental results on various benchmark circuits show that only two test configurations are required to test for all bridging faults, achieving 100% fault coverage, with respect to the fault list.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115886999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cyclic circuits that do not hold state or oscillate are often the most convenient representation for certain functions, such as arbiters, and can easily be produced inadvertently in high-level synthesis, yet are troublesome for most circuit analysis tools. This paper presents an algorithm that generates an acyclic circuit that computes the same function as a given cyclic circuit for those inputs where the cyclic circuit does not oscillate or hold state. The algorithm identifies all patterns on inputs and internal nodes that lead to acyclic evaluation orders for the cyclic circuit, which are represented as acyclic circuit fragments, and then combines these to produce an acyclic circuit that can exhibit all of these behaviors. Experimental results suggest this potentially exponential algorithm is practical for small circuits and may be improved to handle larger circuits. This algorithm should make dealing with cyclic combinational circuits nearly as easy as dealing with their acyclic counterparts.
{"title":"Making cyclic circuits acyclic","authors":"S. Edwards","doi":"10.1145/775832.775874","DOIUrl":"https://doi.org/10.1145/775832.775874","url":null,"abstract":"Cyclic circuits that do not hold state or oscillate are often the most convenient representation for certain functions, such as arbiters, and can easily by produced inadvertently in high-level synthesis, yet are troublesome for most circuit analysis tools. This paper presents an algorithm that generates an acyclic circuit that computes the same function as a given cyclic circuit for those inputs where the cyclic circuit does not oscillate or hold state. The algorithm identifies all patterns on inputs and internal nodes that lead to acyclic evaluation orders for the cyclic circuit, which are represented as acyclic circuit fragments, and then combines these to produce an acyclic circuit that can exhibit all of these behaviors. Experiments results suggest this potentially exponential algorithm is practical for small circuits and may be improved to handle larger circuits. This algorithm should make dealing with cyclic combinational circuits nearly as easy as dealing with their acyclic counterparts.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117191673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper discusses high level techniques for designing fault tolerant systems in SRAM-based FPGAs, without modification in the FPGA architecture. Triple Modular Redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The new technique proposed in this paper was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments in an emulation board. We present some fault coverage results and a comparison with the TMR approach.
{"title":"Designing fault tolerant systems into SRAM-based FPGAs","authors":"F. Lima, L. Carro, R. Reis","doi":"10.1145/775832.775997","DOIUrl":"https://doi.org/10.1145/775832.775997","url":null,"abstract":"This paper discusses high level techniques for designing fault tolerant systems in SRAM-based FPGAs, without modification in the FPGA architecture. Triple Modular Redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The new technique proposed in this paper was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments in an emulation board. We present some fault coverage results and a comparison with the TMR approach.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115557565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
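For readers unfamiliar with the TMR baseline that the abstract above compares against, a minimal Python sketch of 2-of-3 majority voting follows. It illustrates only the conventional TMR idea, not the reduced-overhead technique proposed in the paper; the function names and the 8-bit example word are assumptions made for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a transient fault by inverting one bit of a word."""
    return word ^ (1 << bit)


if __name__ == "__main__":
    golden = 0b1011_0010
    # One of the three replicas suffers a transient fault in bit 5.
    replicas = (golden, flip_bit(golden, 5), golden)
    voted = majority_vote(*replicas)
    print("fault masked:", voted == golden)
```

The voter masks any fault confined to a single replica, which is why TMR triples the logic and, when the voters sit outside the device, the pin count as well. That overhead is the cost the abstract says the proposed technique reduces.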
We present a novel method to efficiently generate, compress and apply test patterns in a logic BIST architecture. Patterns are generated by a modified automatic test pattern generator (ATPG) and are encoded as linear feedback shift register (LFSR) initial values (seeds); one or more patterns can be encoded into a single LFSR seed. During test application, seeds are loaded into the LFSR with no cycle overhead. The method presented achieves reductions of at least 100x in test data and 10x in tester cycles compared to deterministic ATPG while maintaining complete fault coverage, as confirmed by experimental results on industrial designs.
{"title":"Efficient compression and application of deterministic patterns in a logic BIST architecture","authors":"P. Wohl, J. Waicukauski, Sanjay B. Patel, M. Amin","doi":"10.1145/775832.775976","DOIUrl":"https://doi.org/10.1145/775832.775976","url":null,"abstract":"We present a novel method to efficiently generate, compress and apply test patterns in a logic BIST architecture. Patterns are generated by a modified automatic test pattern generator (ATPG) and are encoded as linear feedback shift register (LFSR) initial values (seeds); one or more patterns can be encoded into a single LFSR seed. During test application, seeds are loaded into the LFSR with no cycle overhead. The method presented achieves reductions of at least 100x in test data and 10x in tester cycles compared to deterministic ATPG while maintaining complete fault coverage, as confirmed by experimental results on industrial designs.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123384183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
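The decompression direction of the seed encoding described above can be sketched in a few lines of Python: a short seed loaded into an LFSR expands into a much longer bit stream that fills the scan chains. This is a generic Fibonacci LFSR under assumed parameters (4-bit width, feedback taps at bit positions 0 and 1); the paper's actual LFSR structure and the step that solves for a seed matching an ATPG pattern are not reproduced here.

```python
def lfsr_expand(seed: int, taps: tuple[int, ...], width: int, n_bits: int) -> list[int]:
    """Expand a seed into n_bits output bits with a Fibonacci LFSR.

    taps are 0-indexed state bit positions XORed to form the feedback bit.
    """
    state = seed & ((1 << width) - 1)
    if state == 0:
        raise ValueError("an all-zero seed locks up an XOR-feedback LFSR")
    out = []
    for _ in range(n_bits):
        out.append(state & 1)  # the least significant bit is shifted out
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        # Shift right and insert the feedback bit at the top.
        state = (state >> 1) | (feedback << (width - 1))
    return out


if __name__ == "__main__":
    # A 4-bit seed expands into a 15-bit pseudo-random test pattern.
    print(lfsr_expand(0b0001, taps=(0, 1), width=4, n_bits=15))
```

Because every output bit is a linear (XOR) function of the seed bits, the care bits of a deterministic pattern impose linear equations on the seed. This is the general reason a pattern with few care bits can be stored as a seed far shorter than the pattern itself.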
Software-based self-test (SBST) is an emerging approach to address the challenges of high-quality, at-speed test for complex programmable processors and systems-on-chip (SoCs) that contain them. While early work on SBST has proposed several promising ideas, many challenges remain in applying SBST to realistic embedded processors. We propose a systematic scalable methodology for SBST that automates several key steps. The proposed methodology consists of (i) identifying test program templates that are well suited for test delivery to each module within the processor, (ii) extracting input/output mapping functions that capture the controllability/observability constraints imposed by a test program template for a specific module-under-test, (iii) generating module-level tests by representing the input/output mapping functions as virtual constraint circuits, and (iv) automatic synthesis of a software self-test program from the module-level tests. We propose novel RTL simulation-based techniques for template ranking and selection, and techniques based on the theory of statistical regression for extraction of input/output mapping functions. An important advantage of the proposed techniques is their scalability, which is necessitated by the significant and growing complexity of embedded processors. To demonstrate the utility of the proposed methodology, we have applied it to a commercial state-of-the-art embedded processor (Xtensa from Tensilica Inc.). We believe this is the first practical demonstration of software-based self-test on a processor of such complexity. Experimental results demonstrate that software self-test programs generated using the proposed methodology are able to detect most (95.2%) of the functionally testable faults, and achieve significant simultaneous improvements in fault coverage and test length compared with conventional functional test.
{"title":"A scalable software-based self-test methodology for programmable processors","authors":"Li Chen, S. Ravi, A. Raghunathan, S. Dey","doi":"10.1145/775832.775973","DOIUrl":"https://doi.org/10.1145/775832.775973","url":null,"abstract":"Software-based self-test (SBST) is an emerging approach to address the challenges of high-quality, at-speed test for complex programmable processors and systems-on chips (SoCs) that contain them. While early work on SBST has proposed several promising ideas, many challenges remain in applying SBST to realistic embedded processors. We propose a systematic scalable methodology for SBST that automates several key steps. The proposed methodology consists of (i) identifying test program templates that are well suited for test delivery to each module within the processor, (ii) extracting input/output mapping functions that capture the controllability/observability constraints imposed by a test program template for a specific module-under-test, (iii) generating module-level tests by representing the input/output mapping functions as virtual constraint circuits, and (iv) automatic synthesis of a software self-test program from the module-level tests. We propose novel RTL simulation-based techniques for template ranking and selection, and techniques based on the theory of statistical regression for extraction of input/output mapping functions. An important advantage of the proposed techniques is their scalability, which is necessitated by the significant and growing complexity of embedded processors. To demonstrate the utility of the proposed methodology, we have applied it to a commercial state-of-the-art embedded processor (Xtensa form Tensilica Inc.). We believe this is the first practical demonstration of software-based self-test on a processor of such complexity. Experimental results demonstrate that software self-test programs generated using the proposed methodology are able to detect most (95.2%) of the functionally testable faults, and achieve significant simultaneous improvements in fault coverage and test length compared with conventional functional test.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129580311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Goren, M. Zelikson, R. Gordin, I. Wagner, A. Barger, Alon Amir, B. Livshitz, Anatoly Sherman, Y. Tretiakov, R. Groves, J. Park, D. Jordan, Sue E. Strang, Raminderpal Singh, C. Dickey, D. Harame
This paper expands the on-chip interconnect-aware methodology for high-speed analog and mixed-signal design, presented in D. Goren et al. (2002), into a wider class of designs, including dense layout CMOS design. The proposed solution employs a set of parameterized on-chip transmission line (T-line) devices for the critical interconnects, which is expanded to include coplanar structures while considering the silicon substrate effect. The generalized methodology contains treatment of the crossing line effects at the various design stages, including two-way interactions between the post-layout extraction tool and the T-line devices. The T-line device models are passive by construction, easily migratable among design environments, and allow for both time and frequency domain simulations. These models are verified by S-parameter measurements up to 110 GHz, as well as by EM solver results. It is experimentally shown that the effect of properly designed discontinuities is negligible in most practical cases. The basic on-chip T-line methodology is being used extensively for numerous high-speed designs.
{"title":"On-chip interconnect-aware design and modeling methodology based on high bandwidth transmission line devices","authors":"D. Goren, M. Zelikson, R. Gordin, I. Wagner, A. Barger, Alon Amir, B. Livshitz, Anatoly Sherman, Y. Tretiakov, R. Groves, J. Park, D. Jordan, Sue E. Strang, Raminderpal Singh, C. Dickey, D. Harame","doi":"10.1145/775832.776017","DOIUrl":"https://doi.org/10.1145/775832.776017","url":null,"abstract":"This paper expands the on-chip interconnect-aware methodology for high-speed analog and mixed signal design, presented in D. Goren et al. (2002), into a wider class of designs, including dense layout CMOS design. The proposed solution employs a set of parameterized on-chip transmission line (T-line) devices for the critical interconnects, which is expanded to include coplanar structures while considering the silicon substrate effect. The generalized methodology contains treatment of the crossing line effects at the various design stages, including two way interactions between the post layout extraction tool and the T-line devices. The T-line device models are passive by construction, easily migratable among design environments, and allow for both time and frequency domain simulations. These models are verified by S-parameter measurements up to 110GHz, as well as by EM solver results. It is experimentally shown that the effect of properly designed discontinuities is negligible in most practical cases. The basic on-chip T-line methodology is being used extensively for numerous high-speed designs.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129805079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Procedures for Boolean satisfiability most commonly work with Conjunctive Normal Form. Powerful SAT techniques based on implications and conflicts can be retained when the usual CNF clauses are replaced with BDDs. BDDs provide more powerful implication analysis, which can reduce the computational effort required to determine satisfiability.
{"title":"Checking satisfiability of a conjunction of BDDs","authors":"R. Damiano, J. Kukula","doi":"10.1145/775832.776039","DOIUrl":"https://doi.org/10.1145/775832.776039","url":null,"abstract":"Procedures for Boolean satisfiability most commonly work with Conjunctive Normal Form. Powerful SAT techniques based on implications and conflicts can be retained when the usual CNF clauses are replaced with BDDs. BDDs provide more powerful implication analysis, which can reduce the computational effort required to determine satisfiability.","PeriodicalId":167477,"journal":{"name":"Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129955266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
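The claim above that BDDs give stronger implication analysis than individual CNF clauses can be illustrated with a small Python sketch. It is not the authors' procedure: exhaustive model enumeration stands in for a BDD of the conjoined constraint, and the function names and the integer literal encoding (+v for a variable, -v for its negation) are assumptions made for this example.

```python
from itertools import product

Clause = tuple[int, ...]  # literals: +v for variable v, -v for its negation


def unit_propagate(clauses: list[Clause], assignment: dict[int, bool]) -> dict[int, bool]:
    """Clause-by-clause implication: assign a literal once it is the only one left.

    Conflicts (a clause with no free and no true literal) are ignored in this sketch.
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return assignment


def function_implications(clauses: list[Clause], variables: list[int],
                          assignment: dict[int, bool]) -> dict[int, bool]:
    """Implication on the conjunction as one function: values shared by all models."""
    models = []
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if any(model[v] != val for v, val in assignment.items()):
            continue  # inconsistent with the current partial assignment
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            models.append(model)
    implied = dict(assignment)
    for v in variables:
        seen = {m[v] for m in models}
        if len(seen) == 1:
            implied[v] = seen.pop()
    return implied


if __name__ == "__main__":
    clauses = [(1, 2), (1, -2)]  # (a OR b) AND (a OR NOT b)
    print(unit_propagate(clauses, {}))
    print(function_implications(clauses, [1, 2], {}))
```

With no variables assigned, neither clause is a unit clause, so clause-level propagation learns nothing, while the conjunction viewed as a single function forces variable 1 to true. A BDD of the same conjunction would reduce to a single node testing variable 1, so the implication can be read off its structure without enumerating models.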