A scalable quantitative measure of IR-drop effects for scan pattern generation
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654130
Meng-Fan Wu, Kun-Han Tsai, Wu-Tung Cheng, Hsin-Cheih Pan, Jiun-Lang Huang, A. Kifli
Analysis of power grid IR-drop during scan test application has drawn growing attention because excessive IR-drop may cause a functionally correct device to fail at-speed testing. The analysis is challenging because the power grid IR-drop profile depends not only on the locations of the switching cells but also on the power grid structure. This paper presents a scalable implementation methodology for quantifying the IR-drop effects of a set of switching cells, and illustrates its application to guiding power-safe scan pattern generation. The scalability and effectiveness of the proposed quantitative measure are evaluated on a 130 nm industrial design with 800 K cells.
{"title":"A scalable quantitative measure of IR-drop effects for scan pattern generation","authors":"Meng-Fan Wu, Kun-Han Tsai, Wu-Tung Cheng, Hsin-Cheih Pan, Jiun-Lang Huang, A. Kifli","doi":"10.1109/ICCAD.2010.5654130","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654130","url":null,"abstract":"Analysis of power grid IR-drop during scan test application has drawn growing attention because excessive IR-drop may cause a functionally correct device to fail at-speed testing. The analysis is challenging since the power grid IR-drop profile depends on not only the switching cells locations but also the power grid structure. This paper presents a scalable implementation methodology for quantifying the IR-drop effects of a set of switching cells. An example of its application to guide power-safe scan pattern generation is illustrated. The scalability and effectiveness of the proposed quantitative measure is evaluated with a 130 nm industrial design with 800 K cells.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122661260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of precision for scaling the intermediate variables in fixed-point arithmetic circuits
Pub Date: 2010-11-07, DOI: 10.5555/2133429.2133586
O. Sarbishei, K. Radecka
This paper presents a new technique for scaling the intermediate variables when implementing fixed-point polynomial-based arithmetic circuits. Precision analysis is first used to set the input and coefficient bit-widths of the polynomial so that a given error bound is satisfied. We then present an efficient approach to scale and truncate the intermediate variables without re-computing precision at each stage. After applying it to all the intermediate variables, a final precision computation and sensitivity analysis sets the final truncation bit-widths so that the given error bound remains satisfied. Experimental results on a set of polynomial benchmarks show the robustness and efficiency of the proposed technique.
{"title":"Analysis of precision for scaling the intermediate variables in fixed-point arithmetic circuits","authors":"O. Sarbishei, K. Radecka","doi":"10.5555/2133429.2133586","DOIUrl":"https://doi.org/10.5555/2133429.2133586","url":null,"abstract":"This paper presents a new technique for scaling the intermediate variables in implementing fixed-point polynomial-based arithmetic circuits. Analysis of precision has been used first to set the input and coefficient bit-widths of the polynomial so that a given error bound is satisfied. Then, we present an efficient approach to scale and truncate different intermediate variables with no need of re-computing precision at each stage. After applying it to all the intermediate variables, a final precision computation and sensitivity analysis is performed to set the final values of truncation bits so that the given error bound remains satisfied. Experimental results on a set of polynomial benchmarks show the robustness and efficiency of the proposed technique.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124024293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SETS: Stochastic execution time scheduling for multicore systems by joint state space and Monte Carlo
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654114
Nabeel Iqbal, J. Henkel
The advent of multicore platforms has renewed interest in scheduling techniques for real-time systems. Historically, scheduling decisions have assumed fixed task execution times, typically the Worst-Case Execution Time (WCET). Scheduling against the WCET under-utilizes resources for large classes of applications, and in the realm of multicore systems the notion of WCET is hardly meaningful given the large set of factors influencing it. For soft real-time systems, a more realistic approach is to model tasks with varying (i.e., stochastic) execution times. This paper addresses stochastic execution time scheduling without any assumptions on the statistical properties of the execution times. The proposed method applies to any number of linear acyclic task graphs, independent of the underlying architecture. Joint estimation of execution times and the associated parameters, exploiting the interdependence of parallel tasks, yields a nonlinear, non-Gaussian state space model. To obtain nearly Bayesian estimates irrespective of the execution time characteristics, the state space model is solved recursively by Monte Carlo methods. The recursive solution reduces computational and memory overhead and adapts to the statistical properties of execution times at run time. Finally, a variable-laxity EDF scheduler schedules the tasks using the predicted execution times. We show that variable execution time scheduling improves resource utilization and ensures quality of service. The proposed solution requires no a priori knowledge of any kind and eliminates the fundamental constraints associated with estimating execution times. Results clearly show the advantage of the proposed method: it achieves 76% better task utilization, schedules 68% more tasks, and reduces deadline misses by 53% compared to current state-of-the-art methods.
{"title":"SETS: Stochastic execution time scheduling for multicore systems by joint state space and Monte Carlo","authors":"Nabeel Iqbal, J. Henkel","doi":"10.1109/ICCAD.2010.5654114","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654114","url":null,"abstract":"The advent of multicore platforms has renewed the interest in scheduling techniques for real-time systems. Historically, ‘scheduling decisions’ are implemented considering fixed task execution times, as for the case of Worst Case Execution Time (WCET). The limitations of scheduling considering WCET manifest in terms of under-utilization of resources for large application classes. In the realm of multicore systems, the notion of WCET is hardly meaningful due to the large set of factors influencing it. Within soft real-time systems, a more realistic modeling approach would be to consider tasks featuring varying execution times (i.e. stochastic). This paper addresses the problem of stochastic task execution time scheduling that is agnostic to statistical properties of the execution time. Our proposed method is orthogonal to any number of linear acyclic task graphs and their underlying architecture. The joint estimation of execution time and the associated parameters, relying on the interdependence of parallel tasks, help build a ‘nonlinear Non-Gaussian state space’ model. To obtain nearly Bayesian estimates, irrespective of the execution time characteristics, a recursive solution of the state space model is found by means of the Monte Carlo method. The recursive solution reduces the computational and memory overhead and adapts statistical properties of execution times at run time. Finally, the variable laxity EDF scheduler schedules the tasks considering the predicted execution times. We show that variable execution time scheduling improves the utilization of resources and ensures the quality of service. Our proposed new solution does not require any a priori knowledge of any kind and eliminates the fundamental constraints associated with the estimation of execution times. Results clearly show the advantage of the proposed method as it achieves 76% better task utilization, 68% more task scheduling and deadline miss reduction by 53% compared to current state-of-the-art methods.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"24 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120846502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bi-decomposition of large Boolean functions using blocking edge graphs
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654210
M. Choudhury, K. Mohanram
Bi-decomposition techniques are known to significantly reduce area, delay, and power during logic synthesis because they can explore multi-level AND, OR, and XOR decompositions in a scalable, technology-independent manner. The difficulty in bi-decomposition lies in finding a good variable partition for the given logic function. State-of-the-art techniques use heuristics and/or brute-force enumeration for variable partitioning, which yields sub-optimal partitions and/or poor scalability with function complexity. This paper describes a fast, scalable algorithm for obtaining provably optimum variable partitions for bi-decomposition of Boolean functions by constructing an undirected graph called the blocking edge graph (BEG). To the best of our knowledge, this is the first algorithm that demonstrates a systematic approach to derive disjoint and overlapping variable partitions for bi-decomposition. Since a BEG has only one vertex per input, the technique scales to Boolean functions with hundreds of inputs. Results indicate that, on average, BEG-based bi-decomposition reduces the number of logic levels (mapped delay) of 16 benchmark circuits by 60%, 34%, 45%, and 30% (20%, 19%, 16%, and 20%) over the best results of the state-of-the-art tools FBDD, SIS, ABC, and an industry-standard synthesizer, respectively.
{"title":"Bi-decomposition of large Boolean functions using blocking edge graphs","authors":"M. Choudhury, K. Mohanram","doi":"10.1109/ICCAD.2010.5654210","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654210","url":null,"abstract":"Bi-decomposition techniques have been known to significantly reduce area, delay, and power during logic synthesis since they can explore multi-level and, or, and xor decompositions in a scalable technology-independent manner. The complexity of bi-decompo-sition techniques is in achieving a good variable partition for the given logic function. State-of-the-art techniques use heuristics and/or brute-force enumeration for variable partitioning, which results in sub-optimal results and/or poor scalability with function complexity. This paper describes a fast, scalable algorithm for obtaining provably optimum variable partitions for bi-decomposition of Boolean functions by constructing an undirected graph called the blocking edge graph (BEG). To the best of our knowledge, this is the first algorithm that demonstrates a systematic approach to derive disjoint and overlapping variable partitions for bi-decomposition. Since a BEG has only one vertex per input, our technique scales to Boolean functions with hundreds of inputs. Results indicate that on average, BEG-based bi-decomposition reduces the number of logic levels (mapped delay) of 16 benchmark circuits by 60%, 34%, 45%, and 30% (20%, 19%, 16% and 20%) over the best results of state-of-the-art tools FBDD, SIS, ABC, and an industry-standard synthesizer, respectively.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124989503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast statistical timing analysis of latch-controlled circuits for arbitrary clock periods
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5653800
Bing Li, Ning Chen, Ulf Schlichtmann
Latch-controlled circuits have a remarkable timing-performance advantage as process variations become more relevant for circuit design. Existing methods of statistical timing analysis for such circuits, however, still need improvement in runtime, and their results should be extended to provide yield information for any given clock period. In this paper, we propose a method combining a simplified iteration with a graph transformation algorithm. The result is in parametric form, so the yield for any given clock period can be evaluated easily. The graph transformation algorithm handles the constraints from nonpositive loops effectively, completely avoiding the heuristics used in other existing methods, so the accuracy of the timing analysis is well maintained. Additionally, the proposed method is much faster than existing methods, offering roughly a 100× performance improvement in timing verification, especially for large circuits.
{"title":"Fast statistical timing analysis of latch-controlled circuits for arbitrary clock periods","authors":"Bing Li, Ning Chen, Ulf Schlichtmann","doi":"10.1109/ICCAD.2010.5653800","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5653800","url":null,"abstract":"Latch-controlled circuits have a remarkable advantage in timing performance as process variations become more relevant for circuit design. Existing methods of statistical timing analysis for such circuits, however, still need improvement in runtime and their results should be extended to provide yield information for any given clock period. In this paper, we propose a method combining a simplified iteration and a graph transformation algorithm. The result of this method is in a parametric form so that the yield for any given clock period can easily be evaluated. The graph transformation algorithm handles the constraints from nonpositive loops effectively, completely avoiding the heuristics used in other existing methods. Therefore the accuracy of the timing analysis is well maintained. Additionally, the proposed method is much faster than other existing methods. Especially for large circuits it offers about 100 times performance improvement in timing verification.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123400247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Peak current reduction by simultaneous state replication and re-encoding
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654204
J. Gu, G. Qu, Lin Yuan, Qiang Zhou
Peak current is an important consideration for circuit design and testing in deep sub-micron technologies. In a synchronous finite state machine (FSM), the peak current occurs at state transitions and correlates strongly with the maximum number of state registers switching in the same direction simultaneously [2], which we refer to as the peak switching value (PSV). We propose an FSM synthesis method that reduces PSV by seamlessly combining state replication and state re-encoding techniques. Our experiments show that of 52 FSM benchmarks encoded by a state-of-the-art power-driven encoding algorithm, POW3 [1], 36 are not optimal in terms of PSV. Our approach improves 34 of them with an average 39.2% PSV reduction, while the only comparable PSV-driven FSM synthesis technique [2] improves 27 benchmarks with an average 24.5% reduction. Furthermore, we compare our approach with [2] after the FSMs are implemented using an industry EDA tool. The results show that our approach reduces the peak current in the circuits by 13% on average and the total power by 3%, with a mere 2% area overhead.
{"title":"Peak current reduction by simultaneous state replication and re-encoding","authors":"J. Gu, G. Qu, Lin Yuan, Qiang Zhou","doi":"10.1109/ICCAD.2010.5654204","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654204","url":null,"abstract":"Peak current is one of the important considerations for circuit design and testing in the deep sub-micron technology. In a synchronous finite state machine (FSM), it is observed that the peak current happens at the moment of state transitions and it has a strong correlation with the maximum number of state registers switching in the same direction simultaneously [2], which we refer to as the peak switching value (PSV). We propose a FSM synthesis method to reduce P SV by seamlessly combining state replication and state re-encoding techniques. Our experiments show that out of 52 FSM benchmarks encoded by a state-of-the-art power-driven encoding algorithm POW3 [1], 36 of them are not optimal in terms of PSV. Our approach can improve on 34 of them with an average 39.2% PSV reduction, while the only comparable PSV-driven FSM synthesis technique [2] can improve on 27 benchmarks with an average 24.5% reduction. Furthermore, we compare our approach with [2] after the FSMs are implemented using an industry EDA tool. The results show that our approach reduces the peak current in the circuits by 13% on average and the total power by 3% with a mere 2% overhead in area.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123009797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yield enhancement for 3D-stacked memory by redundancy sharing across dies
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654160
Li Jiang, Rong Ye, Q. Xu
Three-dimensional (3D) memory products are emerging to fulfill the ever-increasing demand for storage capacity. In 3D-stacked memory, redundancy sharing between vertically neighboring memory blocks through short through-silicon vias (TSVs) is a promising solution for yield enhancement. Since different memory dies have distinct fault bitmaps, how to selectively match them together to maximize the yield of the bonded 3D-stacked memory is an interesting and relevant problem. In this paper, we present novel solutions to this problem. Experimental results show that the proposed methodology can significantly increase memory yield compared to bonding only self-reparable dies together.
{"title":"Yield enhancement for 3D-stacked memory by redundancy sharing across dies","authors":"Li Jiang, Rong Ye, Q. Xu","doi":"10.1109/ICCAD.2010.5654160","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654160","url":null,"abstract":"Three-dimensional (3D) memory products are emerging to fulfill the ever-increasing demands of storage capacity. In 3D-stacked memory, redundancy sharing between neighboring vertical memory blocks using short through-silicon vias (TSVs) is a promising solution for yield enhancement. Since different memory dies are with distinct fault bitmaps, how to selectively matching them together to maximize the yield for the bonded 3D-stacked memory is an interesting and relevant problem. In this paper, we present novel solutions to tackle the above problem. Experimental results show that the proposed methodology can significantly increase memory yield when compared to the case that we only bond self-reparable dies together.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"306 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123092801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Synthesis of an efficient controlling structure for post-silicon clock skew minimization
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654274
Yu-Chien Kao, Hsuan-Ming Chou, Kun-Ting Tsai, Shih-Chieh Chang
Clock skew minimization has long been an important design constraint. However, due to the complexity of Process, Voltage, and Temperature (PVT) variations, minimizing clock skew has become a great challenge. To overcome the influence of PVT variations, several previous works proposed a Post-Silicon Tuning (PST) architecture to dynamically balance the skew of a clock tree. The PST architecture has two main components: the Adjustable Delay Buffer (ADB) and the Phase Detector (PD). Most previous works focus on determining good positions for ADBs in a PST design. In this paper, we first show that which pairs of flip-flops (FFs) are connected to PDs, called the PD structure, also greatly influences the complexity of hardware control for a PST design: without careful planning of the PD structure, a large number of control signals is needed to adjust the ADB delays. We also show that the PD structure may influence the accuracy of the clock skew. Among the possible connection structures, this paper proposes an efficient PD structure which not only simplifies the hardware control but also minimizes the clock skew of a PST design.
{"title":"Synthesis of an efficient controlling structure for post-silicon clock skew minimization","authors":"Yu-Chien Kao, Hsuan-Ming Chou, Kun-Ting Tsai, Shih-Chieh Chang","doi":"10.1109/ICCAD.2010.5654274","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654274","url":null,"abstract":"Clock skew minimization has been an important design constraint. However, due to the complexity of Process, Voltage, and Temperature (PVT) variations, the minimization of clock skew has faced a great challenge. To overcome the influence of PVT variations, several previous works proposed Post Silicon Tuning (PST) architecture to dynamically balance the skew of a clock tree. In the PST architecture, there are two main components: Adjustable Delay Buffer (ADB) and Phase Detector (PD). Most previous works focus on determining good positions of ADBs in a PST design. In this paper, we first show that which pairs of FFs are connected to PDs, called PD structure, also greatly influence the complexity of hardware control for a PST design. Without careful planning of a PD structure, we need large number of control signals to adjust the delays of ADBs. In addition, we also show that a PD structure may influence the accuracy of the clock skew. Among possible connection structures, this paper proposes an efficient PD structure which not only simplifies the hardware control but also minimizes the clock skew of a PST design.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128200407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stress-driven 3D-IC placement with TSV keep-out zone and regularity study
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654245
K. Athikulwongse, A. Chakraborty, Jae-Seok Yang, D. Pan, S. Lim
Through-silicon via (TSV) fabrication causes tensile stress around TSVs, which results in significant carrier mobility variation in nearby devices. A keep-out zone (KOZ) is a conservative way to prevent any devices/cells from being impacted by the TSV-induced stress. However, owing to the already large TSV size, a large KOZ can significantly reduce the placement area available for cells, requiring larger dies and negating the wirelength and timing improvements of 3D integration. In this paper, we study the impact of KOZ dimension on stress, carrier mobility variation, area, wirelength, and performance of 3D ICs. We demonstrate that, instead of requiring a large KOZ, 3D-IC placers should exploit TSV stress-induced carrier mobility variation to improve the timing and area objectives during placement. We propose a new TSV stress-driven, force-directed 3D placement that consistently provides placements with, on average, 21.6% better worst negative slack (WNS) and 28.0% better total negative slack (TNS) than wirelength-driven placement.
{"title":"Stress-driven 3D-IC placement with TSV keep-out zone and regularity study","authors":"K. Athikulwongse, A. Chakraborty, Jae-Seok Yang, D. Pan, S. Lim","doi":"10.1109/ICCAD.2010.5654245","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654245","url":null,"abstract":"Through-silicon via (TSV) fabrication causes tensile stress around TSVs which results in significant carrier mobility variation in the devices in their neighborhood. Keep-out zone (KOZ) is a conservative way to prevent any devices/cells from being impacted by the TSV-induced stress. However, owing to already large TSV size, large KOZ can significantly reduce the placement area available for cells, thus requiring larger dies which negate improvement in wirelength and timing due to 3D integration. In this paper, we study the impact of KOZ dimension on stress, carrier mobility variation, area, wirelength, and performance of 3D ICs. We demonstrate that, instead of requiring large KOZ, 3D-IC placers must exploit TSV stress-induced carrier mobility variation to improve the timing and area objectives during placement. We propose a new TSV stress-driven force-directed 3D placement that consistently provides placement result with, on average, 21.6% better worst negative slack (WNS) and 28.0% better total negative slack (TNS) than wirelength-driven placement.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122774620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Template-mask design methodology for double patterning technology
Pub Date: 2010-11-07, DOI: 10.1109/ICCAD.2010.5654288
Chin-Hsiung Hsu, Yao-Wen Chang, S. Nassif
Double patterning technology (DPT) has recently gained much attention and is viewed as the most promising solution for sub-32-nm process nodes. DPT decomposes a layout into two masks and applies double-exposure patterning to increase the pitch size and thus printability. This paper proposes the first mask-sharing methodology for DPT, which shares masks among different designs to reduce the number of costly masks required for double patterning. The design methodology consists of two tasks: template-mask design and template-mask-aware routing. A graph-matching-based algorithm is developed to design a flexible template mask that accommodates as many design patterns as possible. We also present a template-mask-aware routing (TMR) algorithm, focusing on DPT-related issues, to generate routing solutions that satisfy the constraints induced by double patterning and template masks. Experimental results show that the designed template mask saves masks, and that our TMR achieves conflict-free routing with 100% routability, saving at least two masks for each circuit with reasonable wirelength and runtime overheads.
{"title":"Template-mask design methodology for double patterning technology","authors":"Chin-Hsiung Hsu, Yao-Wen Chang, S. Nassif","doi":"10.1109/ICCAD.2010.5654288","DOIUrl":"https://doi.org/10.1109/ICCAD.2010.5654288","url":null,"abstract":"Double patterning technology (DPT) has recently gained much attention and is viewed as the most promising solution for the sub-32-nm node process. DPT decomposes a layout into two masks and applies double exposure patterning to increase the pitch size and thus printability. This paper proposes the first mask-sharing methodology for DPT, which can share masks among different designs, to reduce the number of costly masks for double patterning. The design methodology consists of two tasks: template-mask design and template-mask-aware routing. A graph matching-based algorithm is developed to design a flexible template mask that tries to accommodate as many design patterns as possible. We also present a template-mask-aware routing (TMR) algorithm, focusing on DPT-related issues to generate routing solutions that satisfy the constraints induced from double patterning and template masks. Experimental results show that our designed template mask is mask-saving, and our TMR can achieve conflict-free routing with 100% routability and save at least two masks for each circuit with reasonable wirelength and runtime overheads.","PeriodicalId":344703,"journal":{"name":"2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116253524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}