Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5653800
Bing Li, Ning Chen, Ulf Schlichtmann
Latch-controlled circuits have a remarkable advantage in timing performance as process variations become more relevant for circuit design. Existing methods of statistical timing analysis for such circuits, however, still need improvement in runtime and their results should be extended to provide yield information for any given clock period. In this paper, we propose a method combining a simplified iteration and a graph transformation algorithm. The result of this method is in a parametric form so that the yield for any given clock period can easily be evaluated. The graph transformation algorithm handles the constraints from nonpositive loops effectively, completely avoiding the heuristics used in other existing methods. Therefore the accuracy of the timing analysis is well maintained. Additionally, the proposed method is much faster than other existing methods. Especially for large circuits it offers about 100 times performance improvement in timing verification.
{"title":"Fast statistical timing analysis of latch-controlled circuits for arbitrary clock periods","authors":"Bing Li, Ning Chen, Ulf Schlichtmann","doi":"10.1109/ICCAD.2010.5653800","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5653800","url":null,"abstract":"Latch-controlled circuits have a remarkable advantage in timing performance as process variations become more relevant for circuit design. Existing methods of statistical timing analysis for such circuits, however, still need improvement in runtime and their results should be extended to provide yield information for any given clock period. In this paper, we propose a method combining a simplified iteration and a graph transformation algorithm. The result of this method is in a parametric form so that the yield for any given clock period can easily be evaluated. The graph transformation algorithm handles the constraints from nonpositive loops effectively, completely avoiding the heuristics used in other existing methods. Therefore the accuracy of the timing analysis is well maintained. Additionally, the proposed method is much faster than other existing methods. Especially for large circuits it offers about 100 times performance improvement in timing verification.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123400247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5654118
Szu-Pang Mu, Yi-Ming Wang, Hao-Yu Yang, M. Chao, Shi-Hao Chen, C. Tseng, Tsung-Ying Tsai
Coarse-grain multi-threshold CMOS (MTCMOS) is an effective power-gating technique to reduce IC's leakage power consumption by turning off idle devices with MTCMOS power switches. In this paper, we study the usage of coarse-grain MTCMOS power switches for both logic circuits and SRAMs, and then propose corresponding methods of testing stuck-open power switches for each of them. For logic circuits, a specialized ATPG framework is proposed to generate a longest possible robust test while creating as many effective transitions in the switch-centered region as possible. For SRAMs, a novel test algorithm is proposed to exercise the worst-case power consumption and performance when stuck-open power switches exist. The experimental results based on an industrial MTCMOS technology demonstrate the advantage of our proposed testing methods on detecting stuck-open power switches for both logic circuits and SRAMs, when compared to conventional testing methods.
{"title":"Testing methods for detecting stuck-open power switches in coarse-grain MTCMOS designs","authors":"Szu-Pang Mu, Yi-Ming Wang, Hao-Yu Yang, M. Chao, Shi-Hao Chen, C. Tseng, Tsung-Ying Tsai","doi":"10.1109/ICCAD.2010.5654118","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654118","url":null,"abstract":"Coarse-grain multi-threshold CMOS (MTCMOS) is an effective power-gating technique to reduce IC's leakage power consumption by turning off idle devices with MTCMOS power switches. In this paper, we study the usage of coarse-grain MTCMOS power switches for both logic circuits and SRAMs, and then propose corresponding methods of testing stuck-open power switches for each of them. For logic circuits, a specialized ATPG framework is proposed to generate a longest possible robust test while creating as many effective transitions in the switch-centered region as possible. For SRAMs, a novel test algorithm is proposed to exercise the worst-case power consumption and performance when stuck-open power switches exist. The experimental results based on an industrial MTCMOS technology demonstrate the advantage of our proposed testing methods on detecting stuck-open power switches for both logic circuits and SRAMs, when compared to conventional testing methods.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122540299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of power grid IR-drop during scan test application has drawn growing attention because excessive IR-drop may cause a functionally correct device to fail at-speed testing. The analysis is challenging since the power grid IR-drop profile depends on not only the switching cells locations but also the power grid structure. This paper presents a scalable implementation methodology for quantifying the IR-drop effects of a set of switching cells. An example of its application to guide power-safe scan pattern generation is illustrated. The scalability and effectiveness of the proposed quantitative measure is evaluated with a 130 nm industrial design with 800 K cells.
{"title":"A scalable quantitative measure of IR-drop effects for scan pattern generation","authors":"Meng-Fan Wu, Kun-Han Tsai, Wu-Tung Cheng, Hsin-Cheih Pan, Jiun-Lang Huang, A. Kifli","doi":"10.1109/ICCAD.2010.5654130","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654130","url":null,"abstract":"Analysis of power grid IR-drop during scan test application has drawn growing attention because excessive IR-drop may cause a functionally correct device to fail at-speed testing. The analysis is challenging since the power grid IR-drop profile depends on not only the switching cells locations but also the power grid structure. This paper presents a scalable implementation methodology for quantifying the IR-drop effects of a set of switching cells. An example of its application to guide power-safe scan pattern generation is illustrated. The scalability and effectiveness of the proposed quantitative measure is evaluated with a 130 nm industrial design with 800 K cells.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122661260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5654160
Li Jiang, Rong Ye, Q. Xu
Three-dimensional (3D) memory products are emerging to fulfill the ever-increasing demands of storage capacity. In 3D-stacked memory, redundancy sharing between neighboring vertical memory blocks using short through-silicon vias (TSVs) is a promising solution for yield enhancement. Since different memory dies are with distinct fault bitmaps, how to selectively matching them together to maximize the yield for the bonded 3D-stacked memory is an interesting and relevant problem. In this paper, we present novel solutions to tackle the above problem. Experimental results show that the proposed methodology can significantly increase memory yield when compared to the case that we only bond self-reparable dies together.
{"title":"Yield enhancement for 3D-stacked memory by redundancy sharing across dies","authors":"Li Jiang, Rong Ye, Q. Xu","doi":"10.1109/ICCAD.2010.5654160","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654160","url":null,"abstract":"Three-dimensional (3D) memory products are emerging to fulfill the ever-increasing demands of storage capacity. In 3D-stacked memory, redundancy sharing between neighboring vertical memory blocks using short through-silicon vias (TSVs) is a promising solution for yield enhancement. Since different memory dies are with distinct fault bitmaps, how to selectively matching them together to maximize the yield for the bonded 3D-stacked memory is an interesting and relevant problem. In this paper, we present novel solutions to tackle the above problem. Experimental results show that the proposed methodology can significantly increase memory yield when compared to the case that we only bond self-reparable dies together.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"306 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123092801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5653769
J. Ouyang, Jing Xie, Matthew Poremba, Yuan Xie
In recent 3DIC studies, through silicon vias (TSV) are usually employed as the vertical interconnects in the 3D stack. Despite its benefit of short latency and low power, forming TSVs adds additional complexities to the fabrication process. Recently, inductive/capactive-coupling links are proposed to replace TSVs in 3D stacking because the fabrication complexities of them are lower. Although state-of-the-art inductive/capacitive-coupling links show comparable bandwidth and power as TSV, the relatively large footprints of those links compromise their area efficiencies. In this work, we study the design of 3D network-on-chip (NoC) using inductive/capacitive-coupling links. We propose three techniques to mitigate the area overhead introduced by using these links: (a) serialization, (b) in-transceiver data compression, and (c) high-speed asynchronous transmission. With the combination of these three techniques, evaluation results show that the overheads of all aspects caused by using inductive/capacitive-coupling vertical links can be bounded under 10%.
{"title":"Evaluation of using inductive/capacitive-coupling vertical interconnects in 3D network-on-chip","authors":"J. Ouyang, Jing Xie, Matthew Poremba, Yuan Xie","doi":"10.1109/ICCAD.2010.5653769","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5653769","url":null,"abstract":"In recent 3DIC studies, through silicon vias (TSV) are usually employed as the vertical interconnects in the 3D stack. Despite its benefit of short latency and low power, forming TSVs adds additional complexities to the fabrication process. Recently, inductive/capactive-coupling links are proposed to replace TSVs in 3D stacking because the fabrication complexities of them are lower. Although state-of-the-art inductive/capacitive-coupling links show comparable bandwidth and power as TSV, the relatively large footprints of those links compromise their area efficiencies. In this work, we study the design of 3D network-on-chip (NoC) using inductive/capacitive-coupling links. We propose three techniques to mitigate the area overhead introduced by using these links: (a) serialization, (b) in-transceiver data compression, and (c) high-speed asynchronous transmission. With the combination of these three techniques, evaluation results show that the overheads of all aspects caused by using inductive/capacitive-coupling vertical links can be bounded under 10%.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115777885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5654220
Tao Huang, Evangeline F. Y. Young
In this paper, we present an efficient method to solve the obstacle-avoiding rectilinear Steiner minimum tree (OARSMT) problem optimally. Our work is a major improvement over the work proposed in [1]. First, a new kind of full Steiner trees (FSTs) called obstacle-avoiding full Steiner trees (OAFSTs) is proposed. We show that for any OARSMT problem there exists an optimal tree composed of OAFSTs only. We then extend the proofs on the possible topologies of FSTs in [2] to find the possible topologies of OAFSTs, showing that OAFSTs can be constructed easily. A two-phase algorithm for the construction of OARSMTs is then developed. In the first phase, a sufficient number of OAFSTs are generated. In the second phase, the OAFSTs are used to construct an OARSMT. Experimental results on several benchmarks show that the proposed method achieves 185 times speedup on average and is able to solve more benchmarks than the approach in [1].
{"title":"Obstacle-avoiding rectilinear Steiner minimum tree construction: An optimal approach","authors":"Tao Huang, Evangeline F. Y. Young","doi":"10.1109/ICCAD.2010.5654220","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654220","url":null,"abstract":"In this paper, we present an efficient method to solve the obstacle-avoiding rectilinear Steiner minimum tree (OARSMT) problem optimally. Our work is a major improvement over the work proposed in [1]. First, a new kind of full Steiner trees (FSTs) called obstacle-avoiding full Steiner trees (OAFSTs) is proposed. We show that for any OARSMT problem there exists an optimal tree composed of OAFSTs only. We then extend the proofs on the possible topologies of FSTs in [2] to find the possible topologies of OAFSTs, showing that OAFSTs can be constructed easily. A two-phase algorithm for the construction of OARSMTs is then developed. In the first phase, a sufficient number of OAFSTs are generated. In the second phase, the OAFSTs are used to construct an OARSMT. Experimental results on several benchmarks show that the proposed method achieves 185 times speedup on average and is able to solve more benchmarks than the approach in [1].","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127924065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a new technique for scaling the intermediate variables in implementing fixed-point polynomial-based arithmetic circuits. Analysis of precision has been used first to set the input and coefficient bit-widths of the polynomial so that a given error bound is satisfied. Then, we present an efficient approach to scale and truncate different intermediate variables with no need of re-computing precision at each stage. After applying it to all the intermediate variables, a final precision computation and sensitivity analysis is performed to set the final values of truncation bits so that the given error bound remains satisfied. Experimental results on a set of polynomial benchmarks show the robustness and efficiency of the proposed technique.
{"title":"Analysis of precision for scaling the intermediate variables in fixed-point arithmetic circuits","authors":"O. Sarbishei, K. Radecka","doi":"10.5555/2133429.2133586","DOIUrl":"https://doi.org/10.5555/2133429.2133586","url":null,"abstract":"This paper presents a new technique for scaling the intermediate variables in implementing fixed-point polynomial-based arithmetic circuits. Analysis of precision has been used first to set the input and coefficient bit-widths of the polynomial so that a given error bound is satisfied. Then, we present an efficient approach to scale and truncate different intermediate variables with no need of re-computing precision at each stage. After applying it to all the intermediate variables, a final precision computation and sensitivity analysis is performed to set the final values of truncation bits so that the given error bound remains satisfied. Experimental results on a set of polynomial benchmarks show the robustness and efficiency of the proposed technique.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124024293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5654070
Kun Yuan, D. Pan
In Double Patterning Lithography (DPL), conflict and stitch minimization are two main challenges. Post-routing mask decomposition algorithms [1–4] may not be enough to achieve high quality solution for DPL-unfriendly designs, due to complex metal patterns. In this paper, we propose an efficient framework of WISDOM to perform wire spreading and mask assignment simultaneously for enhanced decomposability. A set of Wire Spreading Candidates (WSC) are identified to eliminate coloring constraints or create additional splitting locations. Based on these candidates, an Integer Linear Programming (ILP) formulation is proposed to simultaneously minimize the number of conflicts and stitches, while introducing as less layout perturbation as possible. To improve scalability, we further propose three acceleration techniques without loss of solution quality: odd-cycle union optimization, coloring-independent group computing, and suboptimal solution pruning. The experimental results show that, compared to a postrouting mask decomposition method [2], we are able to reduce the number of conflicts and stitches by 41% and 23% respectively, with only 0.43% wire length increase. Moreover, with proposed acceleration methods, we achieve 9x speed-up.
{"title":"WISDOM: Wire spreading enhanced decomposition of masks in Double Patterning Lithography","authors":"Kun Yuan, D. Pan","doi":"10.1109/ICCAD.2010.5654070","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654070","url":null,"abstract":"In Double Patterning Lithography (DPL), conflict and stitch minimization are two main challenges. Post-routing mask decomposition algorithms [1–4] may not be enough to achieve high quality solution for DPL-unfriendly designs, due to complex metal patterns. In this paper, we propose an efficient framework of WISDOM to perform wire spreading and mask assignment simultaneously for enhanced decomposability. A set of Wire Spreading Candidates (WSC) are identified to eliminate coloring constraints or create additional splitting locations. Based on these candidates, an Integer Linear Programming (ILP) formulation is proposed to simultaneously minimize the number of conflicts and stitches, while introducing as less layout perturbation as possible. To improve scalability, we further propose three acceleration techniques without loss of solution quality: odd-cycle union optimization, coloring-independent group computing, and suboptimal solution pruning. The experimental results show that, compared to a postrouting mask decomposition method [2], we are able to reduce the number of conflicts and stitches by 41% and 23% respectively, with only 0.43% wire length increase. Moreover, with proposed acceleration methods, we achieve 9x speed-up.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126156602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5653636
M. Shafique, L. Bauer, J. Henkel
We propose a new way to save energy in adaptive processors. According to an execution context the custom instruction set of an adaptive processor is selectively 'muted' at run time and thus the energy efficiency is significantly increased. Implemented are multiple so-called 'muting modes' each leading to particular leakage energy savings. A key challenge of this work is to determine which of the muting modes are beneficial for which part of the custom instruction set in a specific execution context. We demonstrate the feasibility by means of an H.264 video encoder (although not limited to that) for various technology nodes. The complex and unpredictable processing behavior of an H.264 encoder represents thereby a real-world scenario. Our results show on average more than 30% energy savings compared to state-of-the-art. We claim that adaptive processors (and reconfigurable computing in general) would be far more energy efficient if FPGA vendors would provide a basic infrastructure that is necessary to exert our novel technique.
{"title":"Selective instruction set muting for energy-aware adaptive processors","authors":"M. Shafique, L. Bauer, J. Henkel","doi":"10.1109/ICCAD.2010.5653636","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5653636","url":null,"abstract":"We propose a new way to save energy in adaptive processors. According to an execution context the custom instruction set of an adaptive processor is selectively 'muted' at run time and thus the energy efficiency is significantly increased. Implemented are multiple so-called 'muting modes' each leading to particular leakage energy savings. A key challenge of this work is to determine which of the muting modes are beneficial for which part of the custom instruction set in a specific execution context. We demonstrate the feasibility by means of an H.264 video encoder (although not limited to that) for various technology nodes. The complex and unpredictable processing behavior of an H.264 encoder represents thereby a real-world scenario. Our results show on average more than 30% energy savings compared to state-of-the-art. We claim that adaptive processors (and reconfigurable computing in general) would be far more energy efficient if FPGA vendors would provide a basic infrastructure that is necessary to exert our novel technique.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130410402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-11-07DOI: 10.1109/ICCAD.2010.5653749
A. Sridhar, A. Vincenzi, M. Ruggiero, T. Brunschwiler, David Atienza Alonso
Three dimensional stacked integrated circuits (3D ICs) are extremely attractive for overcoming the barriers in interconnect scaling, offering an opportunity to continue the CMOS performance trends for the next decade. However, from a thermal perspective, vertical integration of high-performance ICs in the form of 3D stacks is highly demanding since the effective areal heat dissipation increases with number of dies (with hotspot heat fluxes up to 250W/cm2) generating high chip temperatures. In this context, inter-tier integrated microchannel cooling is a promising and scalable solution for high heat flux removal. A robust design of a 3D IC and its subsequent thermal management depend heavily upon accurate modeling of the effects of liquid cooling on the thermal behavior of the IC during the early stages of design. In this paper we present 3D-ICE, a compact transient thermal model (CTTM) for the thermal simulation of 3D ICs with multiple inter-tier microchannel liquid cooling. The proposed model is compatible with existing thermal CAD tools for ICs, and offers significant speed-up (up to 975x) over a typical commercial computational fluid dynamics simulation tool while preserving accuracy (i.e., maximum temperature error of 3.4%). In addition, a thermal simulator has been built based on 3D-ICE, which is capable of running in parallel on multicore architectures, offering further savings in simulation time and demonstrating efficient parallelization of the proposed approach.
{"title":"3D-ICE: Fast compact transient thermal modeling for 3D ICs with inter-tier liquid cooling","authors":"A. Sridhar, A. Vincenzi, M. Ruggiero, T. Brunschwiler, David Atienza Alonso","doi":"10.1109/ICCAD.2010.5653749","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5653749","url":null,"abstract":"Three dimensional stacked integrated circuits (3D ICs) are extremely attractive for overcoming the barriers in interconnect scaling, offering an opportunity to continue the CMOS performance trends for the next decade. However, from a thermal perspective, vertical integration of high-performance ICs in the form of 3D stacks is highly demanding since the effective areal heat dissipation increases with number of dies (with hotspot heat fluxes up to 250W/cm2) generating high chip temperatures. In this context, inter-tier integrated microchannel cooling is a promising and scalable solution for high heat flux removal. A robust design of a 3D IC and its subsequent thermal management depend heavily upon accurate modeling of the effects of liquid cooling on the thermal behavior of the IC during the early stages of design. In this paper we present 3D-ICE, a compact transient thermal model (CTTM) for the thermal simulation of 3D ICs with multiple inter-tier microchannel liquid cooling. The proposed model is compatible with existing thermal CAD tools for ICs, and offers significant speed-up (up to 975x) over a typical commercial computational fluid dynamics simulation tool while preserving accuracy (i.e., maximum temperature error of 3.4%). In addition, a thermal simulator has been built based on 3D-ICE, which is capable of running in parallel on multicore architectures, offering further savings in simulation time and demonstrating efficient parallelization of the proposed approach.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124859468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}