Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419902
S. Murali, L. Benini, G. Micheli
Three-dimensional integrated circuits, where multiple silicon layers are stacked vertically have emerged recently. The3DICs have smaller form factor, shorter and efficient use of wires and allow integration of diverse technologies in the same device. The use of Networks on Chips (NoCs) to connect components in a 3D chip is a necessity. In this short paper, we present an outline on designing application-specific NoCs for 3D ICs.
{"title":"Design of Networks on Chips for 3D ICs","authors":"S. Murali, L. Benini, G. Micheli","doi":"10.1109/ASPDAC.2010.5419902","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419902","url":null,"abstract":"Three-dimensional integrated circuits, where multiple silicon layers are stacked vertically have emerged recently. The3DICs have smaller form factor, shorter and efficient use of wires and allow integration of diverse technologies in the same device. The use of Networks on Chips (NoCs) to connect components in a 3D chip is a necessity. In this short paper, we present an outline on designing application-specific NoCs for 3D ICs.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130744485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In high-performance nanometer synchronous chip design, a buffered clock tree with high tolerance of process variations is essential. The nominal clock skew always plays a crucial role in determining circuit performance and thus should be a first-order objective for clock-tree synthesis. The clock latency range (CLR), which is the latency difference under different supply voltages, is defined by the 2009 ACM ISPD Clock Network Synthesis Contest as the major optimization objective to measure the effects of process variation on clock-tree synthesis. In this paper, we propose a three-level framework which effectively constructs clock trees by performing blockage-avoiding buffer insertion with both nominal skew and CLR minimization. To cope with the objectives, we present a novel three-stage TTR clock-tree construction algorithm which consists of clock-tree Topology Generation, Tapping-Point Determination, and Routing. Experimental results show that our framework with the TTR algorithm achieves the best average quality for both nominal skew and CLR, compared to all the participating teams for the 2009 ISPD Clock Network Synthesis Contest.
{"title":"Blockage-avoiding buffered clock-tree synthesis for clock latency-range and skew minimization","authors":"Xin-Wei Shih, Chung-Chun Cheng, Yuan-Kai Ho, Yao-Wen Chang","doi":"10.1109/ASPDAC.2010.5419850","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419850","url":null,"abstract":"In high-performance nanometer synchronous chip design, a buffered clock tree with high tolerance of process variations is essential. The nominal clock skew always plays a crucial role in determining circuit performance and thus should be a first-order objective for clock-tree synthesis. The clock latency range (CLR), which is the latency difference under different supply voltages, is defined by the 2009 ACM ISPD Clock Network Synthesis Contest as the major optimization objective to measure the effects of process variation on clock-tree synthesis. In this paper, we propose a three-level framework which effectively constructs clock trees by performing blockage-avoiding buffer insertion with both nominal skew and CLR minimization. To cope with the objectives, we present a novel three-stage TTR clock-tree construction algorithm which consists of clock-tree Topology Generation, Tapping-Point Determination, and Routing. Experimental results show that our framework with the TTR algorithm achieves the best average quality for both nominal skew and CLR, compared to all the participating teams for the 2009 ISPD Clock Network Synthesis Contest.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133406598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419892
Jun Zhu, I. Sander, A. Jantsch
We present a global scheduling framework for synchronous data flow (SDF) streaming applications on MPSoCs, based on optimized computation and contention-free routing. The global scheduling of processors computing and communication transactions are formulated as constraint based problem, to avoid the scheduling overhead in TDMA-like heuristic schemes. A public domain constraint solver is exploited to solve the NP-complete scheduling efficiently, together with problem specific constraint modeling techniques. Experimental results show that the proposed framework can achieve a high predictable application throughput with minimized buffer cost. For instance, for applications in communication domain, higher throughput (up to 87%) has been observed with less buffer cost, compared to scenarios considering the heuristic scheduling overhead.
{"title":"Constrained global scheduling of streaming applications on MPSoCs","authors":"Jun Zhu, I. Sander, A. Jantsch","doi":"10.1109/ASPDAC.2010.5419892","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419892","url":null,"abstract":"We present a global scheduling framework for synchronous data flow (SDF) streaming applications on MPSoCs, based on optimized computation and contention-free routing. The global scheduling of processors computing and communication transactions are formulated as constraint based problem, to avoid the scheduling overhead in TDMA-like heuristic schemes. A public domain constraint solver is exploited to solve the NP-complete scheduling efficiently, together with problem specific constraint modeling techniques. Experimental results show that the proposed framework can achieve a high predictable application throughput with minimized buffer cost. For instance, for applications in communication domain, higher throughput (up to 87%) has been observed with less buffer cost, compared to scenarios considering the heuristic scheduling overhead.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133674273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419875
J. Chun, Jae Wook Lee, J. Abraham
Timing problems in high-speed serial communications are mitigated with phase-interpolator (PI) circuitry. Linearity testing of PI has been challenging, even though PI is widely used in modern high speed I/O architectures. Previous research has focused on implementing additional built-in circuits to measure PI linearity. In this paper, we present a cost effective PI linearity measurement technique which requires no significant modification of existing I/O circuits. Our method uses jitter distributions obtained from random jitter injected into the data channel. Two distributions are separately obtained using undersampling and sampling using PI. The proposed algorithm calculates the differential nonlinearity (DNL) from the difference of these distributions. Simulation results show that the average prediction RMS error for the DNL calculation is 0.31 LSB.
{"title":"A novel characterization technique for high speed I/O mixed signal circuit components using random jitter injection","authors":"J. Chun, Jae Wook Lee, J. Abraham","doi":"10.1109/ASPDAC.2010.5419875","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419875","url":null,"abstract":"Timing problems in high-speed serial communications are mitigated with phase-interpolator (PI) circuitry. Linearity testing of PI has been challenging, even though PI is widely used in modern high speed I/O architectures. Previous research has focused on implementing additional built-in circuits to measure PI linearity. In this paper, we present a cost effective PI linearity measurement technique which requires no significant modification of existing I/O circuits. Our method uses jitter distributions obtained from random jitter injected into the data channel. Two distributions are separately obtained using undersampling and sampling using PI. The proposed algorithm calculates the differential nonlinearity (DNL) from the difference of these distributions. Simulation results show that the average prediction RMS error for the DNL calculation is 0.31 LSB.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132370821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a new voltage IR drop analysis approach for large on-chip power delivery networks. The new approach is based on recently proposed sampling based reduction technique to reduce the circuit matrices before the simulation. Due to the disruptive nature of tap current waveforms in typical industry power grid networks, input current sources typically has wide frequency power spectrum. To avoid the excessively sampling, the new approach introduces an error check mechanism and on-the-fly error reduction scheme during the simulation of the reduced circuits to improve the accuracy of estimating the the large IR drops. The proposed method presents a new way to combine model order reduction and simulation to achieve the overall efficiency of simulation. The new method can also easily trade errors for speed for different applications. Experimental results show the proposed IR drop analysis method can significantly reduce the errors of the existing ETBR method at the similar computing cost, while it can have 10X and more speedup over the the commercial power grid simulator in UltraSim with about 1–2% errors on a number of real industry benchmark circuits.
{"title":"Efficient power grid integrity analysis using on-the-fly error check and reduction","authors":"Duo Li, S. Tan, N. Mi, Yici Cai","doi":"10.5555/1899721.1899897","DOIUrl":"https://doi.org/10.5555/1899721.1899897","url":null,"abstract":"In this paper, we present a new voltage IR drop analysis approach for large on-chip power delivery networks. The new approach is based on recently proposed sampling based reduction technique to reduce the circuit matrices before the simulation. Due to the disruptive nature of tap current waveforms in typical industry power grid networks, input current sources typically has wide frequency power spectrum. To avoid the excessively sampling, the new approach introduces an error check mechanism and on-the-fly error reduction scheme during the simulation of the reduced circuits to improve the accuracy of estimating the the large IR drops. The proposed method presents a new way to combine model order reduction and simulation to achieve the overall efficiency of simulation. The new method can also easily trade errors for speed for different applications. Experimental results show the proposed IR drop analysis method can significantly reduce the errors of the existing ETBR method at the similar computing cost, while it can have 10X and more speedup over the the commercial power grid simulator in UltraSim with about 1–2% errors on a number of real industry benchmark circuits.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115174650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419813
Po-Hsien Chang, Li-C. Wang
This paper studies the problem of automatic assertion extraction at the input boundary of a given unit embedded in a system. This paper proposes a data mining approach that analyzes simulation traces to extract the assertions. We borrow two key concepts from the sequential data mining and develop an effective assertion extraction approach specific to our problem. These two concepts are (1) the slide-window-based episode definition that decides the space of all potential assertions and (2) the Support-Confidence framework that evaluates the meaningfulness of potential assertions using a given simulation trace. We implement the approach in a system simulation environment built on the AMBA 2.0 standard. Experimental results demonstrate the feasibility of the proposed approach and validity of extracted assertions are verified by comparing to the transactions defined in the specification.
{"title":"Automatic assertion extraction via sequential data mining of simulation traces","authors":"Po-Hsien Chang, Li-C. Wang","doi":"10.1109/ASPDAC.2010.5419813","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419813","url":null,"abstract":"This paper studies the problem of automatic assertion extraction at the input boundary of a given unit embedded in a system. This paper proposes a data mining approach that analyzes simulation traces to extract the assertions. We borrow two key concepts from the sequential data mining and develop an effective assertion extraction approach specific to our problem. These two concepts are (1) the slide-window-based episode definition that decides the space of all potential assertions and (2) the Support-Confidence framework that evaluates the meaningfulness of potential assertions using a given simulation trace. We implement the approach in a system simulation environment built on the AMBA 2.0 standard. Experimental results demonstrate the feasibility of the proposed approach and validity of extracted assertions are verified by comparing to the transactions defined in the specification.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115246366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419901
Weiwei Chen, R. Dömer
Embedded system design usually starts from an executable specification model described in a C-based System Level Description Language (SLDL), such as SystemC or SpecC. In this paper, we identify a subset of well-defined C-based design models, called periodic ConcurrenC models, that can be statically scheduled, resulting in significant higher simulation and execution speed. We propose a novel heuristic scheduling algorithm that not only is faster than classic matrix-based synchronous dataflow (SDF) scheduling approaches, but also reduces the model execution time by an order of magnitude over the default discrete event simulation.
{"title":"A fast heuristic scheduling algorithm for periodic ConcurrenC models","authors":"Weiwei Chen, R. Dömer","doi":"10.1109/ASPDAC.2010.5419901","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419901","url":null,"abstract":"Embedded system design usually starts from an executable specification model described in a C-based System Level Description Language (SLDL), such as SystemC or SpecC. In this paper, we identify a subset of well-defined C-based design models, called periodic ConcurrenC models, that can be statically scheduled, resulting in significant higher simulation and execution speed. We propose a novel heuristic scheduling algorithm that not only is faster than classic matrix-based synchronous dataflow (SDF) scheduling approaches, but also reduces the model execution time by an order of magnitude over the default discrete event simulation.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115548442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419791
A. Su
Electronic System Level (ESL) design methodology has been widely adopted in SoC designing, especially for designs with multiple cores. High level synthesis is now becoming a standard tool in the ESL design flow. People use the term ESL Synthesis to suggest the solution for multicore system synthesis. In this paper we argue that ESL Synthesis is architecture synthesis, high level synthesis and software synthesis combined. A multicore architecture synthesis algorithm had been implemented and proven in an experimental industry use. We successfully synthesized the target application, a GSM Edge algorithm for base station, into single and multicore systems. With this experience we developed the theory how high level synthesis and software synthesis should work with architecture synthesis to perform the task of ESL synthesis. Possible future research directions inspired by this work are also proposed. Key contributions of this work are (1) a user-defined cost function mechanism, (2) a warranted convergence mechanism and (3) combine above two mechanisms to waive the need for a universal cost function.
{"title":"Application of ESL Synthesis on GSM Edge algorithm for base station","authors":"A. Su","doi":"10.1109/ASPDAC.2010.5419791","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419791","url":null,"abstract":"Electronic System Level (ESL) design methodology has been widely adopted in SoC designing, especially for designs with multiple cores. High level synthesis is now becoming a standard tool in the ESL design flow. People use the term ESL Synthesis to suggest the solution for multicore system synthesis. In this paper we argue that ESL Synthesis is architecture synthesis, high level synthesis and software synthesis combined. A multicore architecture synthesis algorithm had been implemented and proven in an experimental industry use. We successfully synthesized the target application, a GSM Edge algorithm for base station, into single and multicore systems. With this experience we developed the theory how high level synthesis and software synthesis should work with architecture synthesis to perform the task of ESL synthesis. Possible future research directions inspired by this work are also proposed. Key contributions of this work are (1) a user-defined cost function mechanism, (2) a warranted convergence mechanism and (3) combine above two mechanisms to waive the need for a universal cost function.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115847049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419912
Wei-Tsun Sun, Z. Salcic, Avinash Malik
LibGALS is a library and run-time environment that extends a multi-process host operating system (OS) to support the design of Globally Asynchronous Locally Synchronous (GALS) software systems and models. LibGALS provides an application programming interface (API) that enables the designer to describe GALS concurrent programs and reactivity in sequential programming languages. Moreover, it facilitates the interface between the GALS concurrent program and other processes through the services provided by the host OS. LibGALS is also suitable as a target for code generation from GALS and synchronous concurrent languages. The experiments demonstrate code size and run-time gains when compared with other approaches to GALS system implementation.
{"title":"LibGALS: A library for GALS systems design and modeling","authors":"Wei-Tsun Sun, Z. Salcic, Avinash Malik","doi":"10.1109/ASPDAC.2010.5419912","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419912","url":null,"abstract":"LibGALS is a library and run-time environment that extends a multi-process host operating system (OS) to support the design of Globally Asynchronous Locally Synchronous (GALS) software systems and models. LibGALS provides an application programming interface (API) that enables the designer to describe GALS concurrent programs and reactivity in sequential programming languages. Moreover, it facilitates the interface between the GALS concurrent program and other processes through the services provided by the host OS. LibGALS is also suitable as a target for code generation from GALS and synchronous concurrent languages. The experiments demonstrate code size and run-time gains when compared with other approaches to GALS system implementation.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121155595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419880
Zheng Liu, Lihong Zhang
Performance of analog integrated circuits is highly sensitive to layout parasitics. This paper presents an improved template-based algorithm that automatically conducts performance-constrained parasitic-aware retargeting and optimization of analog layouts. In order to achieve desired circuit performance, performance sensitivities with respect to layout parasitics are first determined. Then the algorithm applies a piecewise-sensitivity model to control parasitic-related layout geometries by directly constructing a set of performance constraints subject to maximum performance deviation due to parasitics. The formulated problem is finally solved using graph-based techniques combined with mixed-integer nonlinear programming. The proposed method has been incorporated into a parasitic-aware automatic layout optimization and retargeting tool. It has been demonstrated to be effective and efficient especially when adapting layout design for new technologies or updated specifications.
{"title":"A performance-constrained template-based layout retargeting algorithm for analog integrated circuits","authors":"Zheng Liu, Lihong Zhang","doi":"10.1109/ASPDAC.2010.5419880","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419880","url":null,"abstract":"Performance of analog integrated circuits is highly sensitive to layout parasitics. This paper presents an improved template-based algorithm that automatically conducts performance-constrained parasitic-aware retargeting and optimization of analog layouts. In order to achieve desired circuit performance, performance sensitivities with respect to layout parasitics are first determined. Then the algorithm applies a piecewise-sensitivity model to control parasitic-related layout geometries by directly constructing a set of performance constraints subject to maximum performance deviation due to parasitics. The formulated problem is finally solved using graph-based techniques combined with mixed-integer nonlinear programming. The proposed method has been incorporated into a parasitic-aware automatic layout optimization and retargeting tool. It has been demonstrated to be effective and efficient especially when adapting layout design for new technologies or updated specifications.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117153986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}