Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105421
Sheng Wei, A. Nahapetian, M. Potkonjak
Current hardware metering techniques, which use manifestational properties of gates for ID extraction, are weakened by the non-uniform effects of aging in conjunction with variations in temperature and supply voltage. As an integrated circuit (IC) ages, the manifestational properties of its gates change, and thus the ID used for hardware metering cannot remain valid over time. Additionally, previous approaches require large numbers of costly measurements and are often difficult to scale to large designs. We resolve the deleterious effects of aging by going to the physical level and primarily targeting the characterization of threshold voltage. Although threshold voltage shifts with aging, we can recover its original value for use as the IC identifier. Another key aspect of our approach is the use of IC segmentation for gate-level characterization, which limits the number of required measurements, making the approach cost-effective and significantly improving its scalability. Finally, by using threshold voltage for ID creation, we are able to quantify the probability of coincidence between legitimate and pirated ICs, thus for the first time quantitatively and accurately demonstrating the effectiveness of a hardware metering approach.
{"title":"Robust passive hardware metering","authors":"Sheng Wei, A. Nahapetian, M. Potkonjak","doi":"10.1109/ICCAD.2011.6105421","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105421","url":null,"abstract":"Current hardware metering techniques, which use manifestational properties of gates for ID extraction, are weakened by the non-uniform effects of aging in conjunction with variations in temperature and supply voltage. As an integrated circuit (IC) ages, the manifestational properties of the gates change, and thus the ID used for hardware metering can not be valid over time. Additionally, the previous approaches require large amounts of costly measurements and often are difficult to scale to large designs. We resolve the deleterious effects of aging by going to the physical level and primarily targeting the characterization of threshold voltage. Although threshold voltage is modified with aging, we can recover its original value for use as the IC identifier. Another key aspect of our approach involves using IC segmentation for gate-level characterization. This results in a cost effective approach by limiting measurements, and has a significant effect on the approach scalability. Finally, by using threshold voltage for ID creation, we are able to quantify the probability of coincidence between legitimate and pirated ICs, thus for the first time quantitatively and accurately demonstrating the effectiveness of a hardware metering approach.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"7 1","pages":"802-809"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88476322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105391
Debapriya Chatterjee, Calvin McCarter, V. Bertacco
Post-silicon validation has become a crucial part of modern integrated circuit design, aiming to capture and eliminate functional bugs that escape pre-silicon verification. The most critical roadblock in post-silicon validation is the limited observability of internal signals of a design, which hinders the ability to diagnose detected bugs. One solution to this issue leverages trace buffers: register buffers embedded into the design that record the values of a small number of state elements over a time interval, triggered by a user-specified event. Due to the trace buffer's area overhead, only a very small fraction of signals can be traced. Thus, the selection of which signals to trace is of paramount importance in post-silicon debugging and diagnosis. Ideally, we would like to select the signals that enable the maximum amount of reconstruction of internal signal values. Several signal selection algorithms for post-silicon debug have been proposed in the literature; they rely on a probability-based state-restoration capacity metric coupled with a greedy algorithm. In this work we propose a more accurate restoration capacity metric, based on simulation information, and present a novel algorithm that overcomes some key shortcomings of previous solutions. We show that our technique provides up to 34% better state restoration than all previous techniques while exhibiting a much better trend as trace buffer size increases.
{"title":"Simulation-based signal selection for state restoration in silicon debug","authors":"Debapriya Chatterjee, Calvin McCarter, V. Bertacco","doi":"10.1109/ICCAD.2011.6105391","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105391","url":null,"abstract":"Post-silicon validation has become a crucial part of modern integrated circuit design to capture and eliminate functional bugs that escape pre-silicon verification. The most critical roadblock in post-silicon validation is the limited observability of internal signals of a design, since this aspect hinders the ability to diagnose detected bugs. A solution to address this issue leverage trace buffers: these are register buffers embedded into the design with the goal of recording the value of a small number of state elements, over a time interval, triggered by a user-specified event. Due to the trace buffer's area overhead, only a very small fraction of signals can be traced. Thus, the selection of which signals to trace is of paramount importance in post-silicon debugging and diagnosis. Ideally, we would like to select signals enabling the maximum amount of reconstruction of internal signal values. Several signal selection algorithms for post-silicon debug have been proposed in the literature: they rely on a probability-based state-restoration capacity metric coupled with a greedy algorithm. In this work we propose a more accurate restoration capacity metric, based on simulation information, and present a novel algorithm that overcomes some key shortcomings of previous solutions. We show that our technique provides up to 34% better state restoration compared to all previous techniques while showing a much better trend with increasing trace buffer size.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"16 1","pages":"595-601"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86973899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105349
Muhsen Owaida, Nikolaos Bellas, C. Antonopoulos, Konstantis Daloukas, C. Antoniadis
The problem of automatically generating hardware modules from high-level application representations has been at the forefront of EDA research in recent years. In this paper, we introduce a methodology to automatically synthesize hardware accelerators from OpenCL applications. OpenCL is a recent industry-supported standard for writing programs that execute on multicore platforms and accelerators such as GPUs. Our methodology maps OpenCL kernels into hardware accelerators based on architectural templates that explicitly decouple computation from memory communication whenever possible. The templates can be tuned to provide a wide repertoire of accelerators that meet user performance requirements and FPGA device characteristics. Furthermore, a set of high- and low-level compiler optimizations is applied to generate optimized accelerators. Our experimental evaluation shows that the generated accelerators are tuned efficiently to match each application's memory access pattern and computational complexity, and to achieve user performance requirements. An important objective of our tool is to expand the FPGA development user base to software engineers, thereby extending the scope of FPGAs beyond the realm of hardware design.
{"title":"Massively parallel programming models used as hardware description languages: The OpenCL case","authors":"Muhsen Owaida, Nikolaos Bellas, C. Antonopoulos, Konstantis Daloukas, C. Antoniadis","doi":"10.1109/ICCAD.2011.6105349","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105349","url":null,"abstract":"The problem of automatically generating hardware modules from high level application representations has been at the forefront of EDA research during the last few years. In this paper, we introduce a methodology to automatically synthesize hardware accelerators from OpenCL applications. OpenCL is a recent industry supported standard for writing programs that execute on multicore platforms and accelerators such as GPUs. Our methodology maps OpenCL kernels into hardware accelerators, based on architectural templates that explicitly decouple computation from memory communication whenever this is possible. The templates can be tuned to provide a wide repertoire of accelerators that meet user performance requirements and FPGA device characteristics. Furthermore, a set of high- and low-level compiler optimizations is applied to generate optimized accelerators. Our experimental evaluation shows that the generated accelerators are tuned efficiently to match the applications memory access pattern and computational complexity, and to achieve user performance requirements. An important objective of our tool is to expand the FPGA development user base to software engineers, thereby expanding the scope of FPGAs beyond the realm of hardware design.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"15 1","pages":"326-333"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86471421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105335
G. Medeiros-Ribeiro, J. Nickel, J. Yang
The rapid progress made over the past three years in understanding the materials science, physics, and engineering of memristors is briefly reviewed. The electroforming phenomenon, and its importance for understanding novel device structures, has been examined from a materials science standpoint, complemented by spectromicroscopy and electron microscopy studies. These studies were used to substantiate a realistic physical model that permitted the development of differential equations governing device behavior, as well as SPICE models and stochastic analysis. Finally, we briefly highlight recent progress in device endurance, which surpasses that of FLASH by several orders of magnitude.
{"title":"Progress in CMOS-memristor integration","authors":"G. Medeiros-Ribeiro, J. Nickel, J. Yang","doi":"10.1109/ICCAD.2011.6105335","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105335","url":null,"abstract":"The fast improvements that have been realized over the past 3 years in the understanding of the materials science, physics and engineering of memristors are briefly reviewed. The electroforming phenomena and the associated importance for the understanding of novel device structures has been revealed from a materials science standpoint, complemented with a spectromicroscopy study and electronic microscopy. These studies were utilized to substantiate a realistic physical model that permitted the development of differential equations governing device behavior, as well as SPICE models and stochastic analysis. Finally, we briefly highlight recent progress in device endurance, which surpasses that of FLASH by several orders of magnitude.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"26 1","pages":"246-249"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84890946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105419
Hui Zhao, M. Kandemir, W. Ding, M. J. Irwin
The Network-on-Chip (NoC) plays a crucial role in designing low-cost chip multiprocessors (CMPs) as the number of cores on a chip keeps increasing. However, buffers in NoC routers increase the cost of CMPs in terms of both area and power. Recently, bufferless routers have been proposed to reduce these costs by removing buffers from the routers. However, bufferless routers provide competitive performance only when network utilization is moderate. In this paper, we propose a novel heterogeneous design that employs both buffered and bufferless routers in the same NoC to achieve high performance at low cost. We evaluate a variety of plans for placing buffered and bufferless routers in an NoC-based CMP according to performance requirements and power budgets. To take full advantage of these heterogeneous NoCs, we also propose novel strategies for buffered-router-aware application thread mapping and a routing algorithm (once the router placement is fixed). Our evaluations show that, by utilizing the proposed techniques, a heterogeneous NoC not only achieves performance comparable to that of NoCs with buffered routers but also reduces buffer cost and energy consumption.
{"title":"Exploring heterogeneous NoC design space","authors":"Hui Zhao, M. Kandemir, W. Ding, M. J. Irwin","doi":"10.1109/ICCAD.2011.6105419","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105419","url":null,"abstract":"The Network-on-Chip (NoC) plays a crucial role in designing low cost chip multiprocessors (CMPs) as the number of cores on a chip keeps increasing. However, buffers in NoC routers increase the cost of CMPs in terms of both area and power. Recently, bufferless routers have been proposed to reduce such costs by removing buffers from the routers. However, bufferless routers can provide competitive performance only when network utilization is moderate. In this paper, we propose a novel heterogeneous design that employs both buffered and bufferless routers in the same NoC to achieve high performance at low cost. We evaluate a variety of plans to place buffered and bufferless routers in an NoC based CMP according to performance requirements and power allowances. In order to take full advantage of these heterogeneous NoCs, we also propose novel strategies for buffered-router-aware application thread mapping and a routing algorithm (once the router placement is fixed). Our evaluations show that, by utilizing the techniques we proposed, a heterogeneous NoC does not only achieve performance comparable to that of the NoCs with buffered routers but also reduces buffer costs and energy consumption.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"787-793"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78463282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105359
Hsiou-Yuan Liu, Yen-Cheng Chou, Chen-Hsuan Lin, J. H. Jiang
Upon receiving the output sequence streaming from a sequential encoder, a decoder reconstructs the corresponding input sequence that streamed into the encoder. Such encoding and decoding schemes are commonly encountered in communication, cryptography, signal processing, and other applications. Given an encoder specification, decoder design can be error-prone and time-consuming. Its automation may help designers improve productivity and justify encoder correctness. Although recent advances have shown promising progress, there is still no complete method that decides whether a decoder exists for a finite state transition system. The quest for completely automatic decoder synthesis remains. This paper presents a complete and practical approach to automating decoder synthesis via incremental SAT solving and Craig interpolation. Experiments show that, for decoder-existent cases, our method synthesizes decoders effectively; for decoder-nonexistent cases, our method concludes the non-existence instantly, whereas prior methods may fail.
{"title":"Towards completely automatic decoder synthesis","authors":"Hsiou-Yuan Liu, Yen-Cheng Chou, Chen-Hsuan Lin, J. H. Jiang","doi":"10.1109/ICCAD.2011.6105359","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105359","url":null,"abstract":"Upon receiving the output sequence streaming from a sequential encoder, a decoder reconstructs the corresponding input sequence that streamed to the encoder. Such an encoding and decoding scheme is commonly encountered in communication, cryptography, signal processing, and other applications. Given an encoder specification, decoder design can be error-prone and time consuming. Its automation may help designers improve productivity and justify encoder correctness. Though recent advances showed promising progress, there is still no complete method that decides whether a decoder exists for a finite state transition system. The quest for completely automatic decoder synthesis remains. This paper presents a complete and practical approach to automating decoder synthesis via incremental SAT solving and Craig interpolation. Experiments show that, for decoder-existent cases, our method synthesizes decoders effectively; for decoder-nonexistent cases, our method concludes the non-existence instantly while prior methods may fail.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"175 1","pages":"389-395"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76885641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105343
Yen-Hung Lin, Y. Ban, D. Pan, Yih-Lang Li
The printed image of a layout that satisfies the double patterning lithography (DPL) constraints may not have good fidelity if the layout neglects optical proximity correction (OPC). Simultaneously considering DPL and OPC becomes necessary when generating layouts, especially in the routing stage. Moreover, a decomposed design with balanced mask density has a lower edge placement error (EPE) than an unbalanced one [6]. This work proposes a comprehensive conflict graph (CCG) to enable detailed routers to simultaneously consider DPL, OPC, and mask density to generate litho-friendly layouts. This work then develops a DPL-aware and OPC-friendly gridless detailed routing algorithm (DOPPLER) by applying the CCG in a gridless routing model. A density-variation-threshold annealing-based routing flow is also proposed to prevent DOPPLER from falling into a sub-optimal mask density balance. Compared with existing DPL-aware detailed routing works, DOPPLER demonstrates an average 73.84% EPE hotspot reduction with a satisfactory mask density, at the cost of an average increase of 0.08% in wirelength, 15.14% in the number of stitches, and 77.28% in runtime.
{"title":"Doppler: DPL-aware and OPC-friendly gridless detailed routing with mask density balancing","authors":"Yen-Hung Lin, Y. Ban, D. Pan, Yih-Lang Li","doi":"10.1109/ICCAD.2011.6105343","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105343","url":null,"abstract":"The printed image of a layout that satisfies the double patterning lithograph (DPL) constraints may not have good fidelity if the layout neglects optical proximity correction (OPC). Simultaneously considering DPL and OPC becomes necessary when gene rating layouts, especially in routing stage. Moreover, one decomposed design with balanced mask density has a lower edge placement error (EPE)than an unbalance done[6]. This work proposes a compre-hensive conflict graph (CCG)to enable detailed routers to simultaneously consider DPL, OPC, and mask density to gene rate litho-friendly layouts. This work then develops an DPL-aware and OPC-friendly gridless detailed routing (DOPPLER) by applying CCG in a gridless routing model. A density variation threshold annealing-based routing flow is also proposed to prevent DOPPLER from falling into a sub-optimal mask density balance. Compared with existing DPL-aware detailed routing works, DOPPLER demon-stratesanaverage 73.84% of EPE hotspot reduction with a satisfactory mask density at the cost of an average increase of 0.08% wire-length, 15.14% number of stitches, and 77.28% runtime.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"21 1","pages":"283-289"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75076074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105341
A. M. El-Husseini, Matthew Morrise
In the competitive CPU market, emphasis is placed on automation that promotes innovation and supports the design of multiple CPU configurations on schedule, with timely turnaround to market. Design automation is essential across many areas of CPU design, including the global clock distribution. This paper describes design automation tools developed to handle the design of the global clock distributions for various Intel microprocessors.
{"title":"Clocking design automation in Intel's Core i7 and future designs","authors":"A. M. El-Husseini, Matthew Morrise","doi":"10.1109/ICCAD.2011.6105341","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105341","url":null,"abstract":"In the competitive CPU market, emphasis is placed on automations that promotes innovation and supports design of multiple CPU configuration which meets schedule and timely turnaround to market. Design automation is essential in the design of different areas of CPU including the global clock distributions. This paper talks about design automation tools developed to handle the design of the global clock distributions for various Intel microprocessors.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"50 1","pages":"276-278"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75819599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105377
Hui-Fang Tsao, Pang-Yen Chou, Shih-Lun Huang, Yao-Wen Chang, Mark Po-Hung Lin, Duan-Ping Chen, Dick Liu
Modern circuit placement, especially analog placement, often needs to consider various constraints, such as symmetry, proximity, preplaced, variant, fixed-boundary, minimum separation, boundary, and fixed-outline constraints, for better electrical effects and higher performance. To handle these diverse constraints, topological floorplan representations are pervasively used because of their higher flexibility and smaller solution space. Due to their intrinsic limitation in deriving module adjacency information directly from the representations themselves, however, they might incur difficulties in handling related constraints. In this paper, we work on B∗-trees, which have been shown to be most effective and efficient for floorplan/placement problems, and present a corner stitching compliant B∗-tree (CB-tree, for short) to remedy the significant deficiency in their module adjacency handling. A CB-tree is a B∗-tree integrated with modified corner stitching to offer much higher flexibility/efficiency, especially for adjacent module identification/packing. Compared with previous works, CB-trees achieve the lowest time complexity for module packing with the aforementioned constraints. Experimental results show that CB-trees achieve the best solution quality and consume the least running time for industrial designs with various constraints. In particular, our work provides key insights into the handling of comprehensive placement constraints with a topological representation.
{"title":"A corner stitching compliant B∗-tree representation and its applications to analog placement","authors":"Hui-Fang Tsao, Pang-Yen Chou, Shih-Lun Huang, Yao-Wen Chang, Mark Po-Hung Lin, Duan-Ping Chen, Dick Liu","doi":"10.1109/ICCAD.2011.6105377","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105377","url":null,"abstract":"Modern circuit placement, especially analog placement, often needs to consider various constraints, such as symmetry, proximity, preplaced, variant, fixed-boundary, minimum separation, boundary, and fixed-outline constraints, for better electrical effects and higher performance. To handle these diverse constraints, topo-logical floorplan representations are pervasively used because of their higher flexibility and smaller solution space. Due to their intrinsic limitation in deriving module adjacency information directly from the representations themselves, however, they might incur difficulties in handling related constraints. In this paper, we work on B∗-trees, which have been shown to be most effective and efficient for floor-plan/placement problems, and present a corner stitching compliant B∗-tree (CB-tree, for short) to remedy the significant deficiency in its module adjacency handling. A CB-tree is a B∗-tree integrated with modified corner stitching to offer much higher flexibility/efficiency, especially for adjacent module identification/packing. Compared with the previous works, CB-trees can achieve the lowest time complexity for module packing with the aforementioned constraints. Experimental results show that the CB-trees achieve the best solution quality and consume the least running time for industrial designs with various constraints. In particular, our work provides key insights into the handling of comprehensive placement constraints with a topological representation.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"141 1","pages":"507-511"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73293596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105370
Yaojun Zhang, Xiaobin Wang, Yiran Chen
The rapidly increasing demand for memory in the electronics industry and the significant scaling challenges facing all conventional memory technologies have motivated research on next-generation memory technologies. As one promising candidate, spin-transfer torque random access memory (STT-RAM) features fast access time, high density, non-volatility, and good CMOS process compatibility. However, like all other nano-scale devices, the performance and reliability of STT-RAM cells are severely affected by process variations, intrinsic device operating uncertainties, and environmental fluctuations. In this work, we systematically analyze the impacts of CMOS and MTJ process variations, MTJ switching uncertainties induced by thermal fluctuations, and working temperature on the performance and reliability of STT-RAM cells. A combined circuit and magnetic simulation platform is also established to quantitatively analyze the persistent and non-persistent error rates during STT-RAM cell operation. Finally, an optimization flow and its effectiveness are presented, using several STT-RAM cell designs as case studies.
{"title":"STT-RAM cell design optimization for persistent and non-persistent error rate reduction: A statistical design view","authors":"Yaojun Zhang, Xiaobin Wang, Yiran Chen","doi":"10.1109/ICCAD.2011.6105370","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105370","url":null,"abstract":"The rapidly increased demands for memory in electronic industry and the significant technical scaling challenges of all conventional memory technologies motivated the researches on the next generation memory technology. As one promising candidate, spin-transfer torque random access memory (STT-RAM) features fast access time, high density, non-volatility, and good CMOS process compatibility. However, like all other nano-scale devices, the performance and reliability of STT-RAM cells are severely affected by process variations, intrinsic device operating uncertainties and environmental fluctuations. In this work, we systematically analyze the impacts of CMOS and MTJ process variations, MTJ switching uncertainties induced by thermal fluctuations and working temperature on the performance and reliability of STT-RAM cells. A combined circuit and magnetic simulation platform is also established to quantitatively analyze the persistent and non-persistent error rates during the STT-RAM cell operations. Finally, an optimization flow and its effectiveness are depicted by using some STT-RAM cell designs as case study.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"471-477"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83287047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}