Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523666
Richard Lee, Karim Abdel-Khalek, S. Abdi, Frederic Risacher
This paper describes a methodology for developing abstract, executable system-level models in SystemC of real-time embedded software targeted to an RTOS. We design an RTOS emulation layer, called RESC, on top of the SystemC kernel. The application software is linked against the emulation layer to create an executable model of the software. The model can be integrated into system-level HW-SW models, which can be used for fast, accurate, and early system validation. We first identify key real-time software constructs such as task-level concurrency, priorities, timers, pulses, and message-passing communication. We then define equivalent abstractions of these constructs in RESC on top of the SystemC library. We validated our models using industrial-size examples such as an MP3 decoder and a vocoder. The experimental results show that our models are very accurate (< 1% error) and significantly faster (up to 11×) than real-time software execution on the target platform.
Title: Early system level modeling of real-time applications on embedded platforms. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523628
Chih-han Hsu, S. Ruan, Ying-Jung Chen, Tsang-Chi Kan
Vertical integration of layers in a 3D IC exacerbates thermal problems, especially reliability degradation. Low reliability can not only damage the whole circuit but also cause unexpected performance loss. In this paper, we run the SA engine with rectangle-STSVs and double-STSVs to improve reliability. Earlier research indicates that the more STSVs a chip has, the better its reliability; however, more STSVs also imply a larger area. We therefore develop a methodology for thermal-aware floorplanning that trades off the number of STSVs, reliability, and chip area. Moreover, we apply a precise thermal model to the resulting floorplan to insert TTSVs in the via channels. Experimental results show that more than 80% of single-STSVs can be replaced by rectangle-STSVs or double-STSVs, thereby improving reliability. Furthermore, after TTSV insertion the temperature can be held at around 80°C with a minimal number of TTSVs.
Title: Reliability consideration with rectangle- and double-signal through silicon vias insertion in 3D thermal-aware floorplanning. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523643
D. Ghai, S. Mohanty, G. Thakral
Mature electronic design automation (EDA) tools and well-defined abstraction levels have almost fully automated the digital design process. However, analog circuit design and optimization is still not automated. Custom design of analog circuits and slow analog simulation in SPICE have always demanded maximum effort, skill, and design cycle time. This paper presents a novel design flow for constrained optimization of nano-CMOS analog circuits. The proposed analog design flow combines polynomial-regression-based models and a genetic algorithm for fast optimization. To evaluate the effectiveness of the proposed design flow, power minimization of a 50nm CMOS current-starved voltage-controlled oscillator (VCO) is carried out, with oscillation frequency treated as a performance constraint. Accurate polynomial-regression-based models are developed for the power and frequency of the VCO. The goodness-of-fit of the models is evaluated using SSE, RMSE, and R². Using these models, we form a constrained optimization problem that is solved with a genetic algorithm. The flow achieved 21.67% power savings under the constraint frequency ≥ 100 MHz. To the best of the authors' knowledge, this is the first study to approach a VCO design problem as a mathematical constrained optimization using regression-based modeling and a genetic algorithm.
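The flow described in this abstract (cheap regression models standing in for SPICE, searched by a genetic algorithm under a frequency constraint) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the two quadratic models `power_uw` and `freq_mhz`, the normalized width range, and the GA settings are hypothetical and are not taken from the paper. Only the constraint value of 100 MHz comes from the abstract. Infeasible candidates are rejected outright, which is one simple way to handle the constraint and not necessarily the authors' choice.

```python
import random

BOUNDS = (1.0, 4.0)  # hypothetical normalized transistor-width range
F_MIN = 100.0        # MHz, the frequency constraint quoted in the abstract


def power_uw(wn, wp):
    """Hypothetical fitted polynomial model of VCO power (microwatts)."""
    return 10 + 5 * wn + 4 * wp + 0.5 * wn * wp


def freq_mhz(wn, wp):
    """Hypothetical fitted polynomial model of oscillation frequency (MHz)."""
    return 40 + 30 * wn + 25 * wp - 2 * wn ** 2 - 2 * wp ** 2


def fitness(ind):
    """Power to minimize; candidates that violate the constraint are rejected."""
    wn, wp = ind
    return power_uw(wn, wp) if freq_mhz(wn, wp) >= F_MIN else float("inf")


def clip(x):
    """Keep a design variable inside the allowed range."""
    return min(max(x, BOUNDS[0]), BOUNDS[1])


def optimize(baseline=(4.0, 4.0), pop_size=20, generations=40, seed=0):
    """Tiny elitist GA: tournament selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [baseline] + [
        (rng.uniform(*BOUNDS), rng.uniform(*BOUNDS)) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        children = [min(pop, key=fitness)]  # elitism: the best survives unchanged
        while len(children) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)
            b = min(rng.sample(pop, 3), key=fitness)
            t = rng.random()
            children.append(
                tuple(clip(t * x + (1 - t) * y + rng.gauss(0, 0.1)) for x, y in zip(a, b))
            )
        pop = children
    return min(pop, key=fitness)


if __name__ == "__main__":
    best = optimize()
    print("widths:", best, "power:", power_uw(*best), "freq:", freq_mhz(*best))
```

Because the feasible baseline design is placed in the initial population and the best individual is always carried over, the returned design is feasible and never consumes more modeled power than the baseline. Each fitness evaluation is a polynomial evaluation, which is where the speed advantage over simulation-in-the-loop optimization would come from.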
Title: Fast analog design optimization using regression-based modeling and genetic algorithm: A nano-CMOS VCO case study. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523596
C. O'Sullivan, P. Levine, S. Garg
We propose a new test array architecture, vertically-addressed test structures (VATS), to experimentally characterize the within-tier and tier-to-tier process variations and through-silicon via (TSV) induced stress in 3D integrated circuits (ICs). The proposed VATS architecture utilizes the benefits of 3D integration to simultaneously provide high density, low I/O pin utilization, and high fidelity. A test chip featuring eight VATS arrays (>15,000 active devices) has been designed and fabricated in a two-tier, 130-nm 3D IC technology. Simulation results highlight the advantages of the proposed VATS architecture compared to conventional 2D test arrays. We also propose a radial filtering scheme to discriminate between process variations and the impact of TSV-induced stress in 3D ICs.
Title: Vertically-addressed test structures (VATS) for 3D IC variability and stress measurements. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523654
Mohammad Shokrolah Shirazi, B. Morris, H. Selvaraj
FPGA-based fault injection methods have recently become more popular because they run fault injection experiments at high speed. During each experiment, the FPGA must send data from the observation points back to the host computer for fault-tolerance analysis. Because the data volume is high, the FPGA spends most of its time in communication. In this paper, we solve this problem by bringing all parts of the fault injection tool inside the FPGA. The area overhead associated with observation data is avoided by using a simple observation circuit. As a case study, we injected 6400 SEU faults into the OpenRISC 1200 processor on a Cyclone II FPGA. Results show that our fault injection experiments run more than 400 times faster than one of the traditional FPGA-based fault injection methods, with only 5% area overhead.
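As a software analogy of the idea in this abstract, the sketch below flips one register bit per experiment (a single-event upset) and keeps only a failure counter instead of shipping every observed value to a host. It is a minimal illustration, not the authors' tool: `toy_design`, the 32-bit register width, and the function names are hypothetical stand-ins for the circuit under test and the on-chip observation logic.

```python
from typing import Callable, Tuple


def inject_seu(value: int, bit: int, width: int = 32) -> int:
    """Flip exactly one bit of a `width`-bit register value."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside register width")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def toy_design(reg: int) -> int:
    """Hypothetical circuit under test: only the low byte reaches the output."""
    return reg & 0xFF


def run_campaign(
    design: Callable[[int], int], reg_value: int, width: int = 32
) -> Tuple[int, int]:
    """Inject one SEU per bit position and count outputs that differ from the golden run.

    Only the counter leaves the loop, mirroring the idea of comparing results
    next to the design instead of transferring raw observation data.
    """
    golden = design(reg_value)
    failures = 0
    for bit in range(width):
        if design(inject_seu(reg_value, bit, width)) != golden:
            failures += 1
    return failures, width


if __name__ == "__main__":
    failures, total = run_campaign(toy_design, 0x12345678)
    print(f"{failures} of {total} injected faults changed the output")
```

In this toy case only upsets in the low eight bits propagate to the output, so the remaining injections are masked. A real campaign would target flip-flops of the processor and classify outcomes in more detail.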
Title: Fast FPGA-based fault injection tool for embedded processors. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523588
Yen-Han Lee, Ing-Chao Lin, Sheng-Wei Wang
Ternary content addressable memory (TCAM), which can store 0, 1 and X in its cells, is widely used to store routing tables in network routers. Meanwhile, NBTI (Negative Bias Temperature Instability) and PBTI (Positive Bias Temperature Instability), which increase Vth and degrade transistor switching speed, have become major reliability challenges. In this paper, we propose a novel TCAM architecture to reduce BTI degradation using a bit-flipping technique. This novel TCAM architecture ensures the correctness of read, write and search operations. We also analyze the signal probabilities of TCAM cells, and demonstrate that the bit-flipping technique can balance signal probabilities. By using the bit-flipping technique, 76.40% of the data cells under investigation were found to have signal probabilities close to 50%, which is 62.80% higher than the original architecture. In addition, 92.60% of the mask cells had signal probabilities close to 50%, which is 91.20% higher than the original architecture. When considering the overhead of the bit-flipping technique, the best flipping frequency is once a day.
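The effect of bit flipping on signal probability can be seen in a few lines. This is a minimal sketch under simplifying assumptions: a cell is modeled as a list of stored bits sampled at equal time steps, and the stored value is inverted during every other interval of `period` samples, with the interval parity playing the role of a flip flag. The function names and the sampling model are illustrative and do not describe the paper's circuit.

```python
from typing import List


def signal_probability(bits: List[int]) -> float:
    """Fraction of time the cell physically holds a 1."""
    return sum(bits) / len(bits)


def apply_bit_flipping(bits: List[int], period: int) -> List[int]:
    """Invert the stored bit during every other interval of `period` samples.

    XOR with the interval parity is its own inverse, so applying the same
    function again recovers the logical contents for reads and searches.
    """
    return [b ^ ((t // period) % 2) for t, b in enumerate(bits)]


if __name__ == "__main__":
    logical = [1] * 8                         # a cell that logically stores 1 all the time
    physical = apply_bit_flipping(logical, period=2)
    print(signal_probability(logical))        # 1.0: one transistor is stressed constantly
    print(physical)                           # [1, 1, 0, 0, 1, 1, 0, 0]
    print(signal_probability(physical))       # 0.5: stress is shared evenly
```

A cell whose logical contents never change moves from a signal probability of 1.0 to 0.5, which is the balancing effect the abstract reports for most data and mask cells.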
Title: Impacts of NBTI and PBTI effects on ternary CAM. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523587
J. Wan, H. Kerkhoff
A compact NBTI model is presented that directly solves the reaction-diffusion (RD) equations in a simple way. The new model can handle arbitrary stress conditions without solving time-consuming equations and is hence very suitable for analog/mixed-signal NBTI simulations in SPICE-like environments. The model has been implemented in Cadence ADE with Verilog-A and also takes the stochastic effect of aging into account. Simulation speed increases by at least thousands of times. The performance of the model is validated against both RD theoretical solutions and silicon results.
Title: An arbitrary stressed NBTI compact model for analog/mixed-signal reliability simulations. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523612
Y. Shinozuka, H. Fuketa, K. Ishida, F. Furuta, K. Osada, K. Takeda, M. Takamiya, T. Sakurai
This paper proposes a method to reduce the supply voltage IR drop of 3D stacked-die systems by implementing an on-chip Buck Converter on Top die (BCT) scheme. The IR drop is caused by the parasitic resistance of the through-silicon vias (TSVs) used in 3D integration. The IR drop reduction and the overhead associated with the BCT scheme are modeled and analyzed. A 3D stacked-die system is manufactured using 90nm CMOS technology with TSVs and a silicon interposer. A chip inductor and chip capacitors for the buck converter are mounted directly on the top die. The reduction of the IR drop to less than 1/4 is verified through experiments.
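A first-order way to see why a step-down converter placed after the TSVs can shrink the IR drop is that delivering the same load power at a higher voltage draws less current through the TSV resistance. The sketch below assumes this mechanism and a lumped TSV resistance; the function names, the converter efficiency parameter, and all numbers are hypothetical, and the model ignores the effect of the drop itself on the current. It is not the analysis from the paper.

```python
def tsv_ir_drop_direct(p_load_w: float, v_load: float, r_tsv_ohm: float) -> float:
    """IR drop when the load voltage is delivered directly through the TSVs."""
    return (p_load_w / v_load) * r_tsv_ohm


def tsv_ir_drop_with_buck(
    p_load_w: float, v_in: float, r_tsv_ohm: float, efficiency: float = 1.0
) -> float:
    """IR drop when a higher voltage v_in crosses the TSVs and is stepped down afterwards.

    The converter draws p_load / efficiency from its input, so the TSV current
    is p_load / (efficiency * v_in).
    """
    return (p_load_w / (efficiency * v_in)) * r_tsv_ohm


if __name__ == "__main__":
    # Illustrative numbers only: 1 W load at 1.0 V, 50 mOhm lumped TSV resistance,
    # 3.0 V delivered to a 90%-efficient converter.
    direct = tsv_ir_drop_direct(1.0, 1.0, 0.05)
    buck = tsv_ir_drop_with_buck(1.0, 3.0, 0.05, efficiency=0.9)
    print(f"direct: {direct * 1e3:.1f} mV, with buck converter: {buck * 1e3:.1f} mV")
```

Under these assumptions the drop scales as the load voltage divided by the product of efficiency and input voltage, so a larger conversion ratio or a more efficient converter gives a smaller drop.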
Title: Reducing IR drop in 3D integration to less than 1/4 using Buck Converter on Top die (BCT) scheme. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523674
A. Aguiar, C. Moratelli, M. Sartori, Fabiano Hessel
Recently, virtualization techniques have been investigated as an interesting approach for complex embedded system designs, since they allow more secure systems, improve software design quality, and reduce costs. However, the need to meet design constraints, mainly real-time constraints, is one of the biggest challenges that may prevent the wide adoption of virtualization in embedded systems. Industry designers and researchers believe that hardware support for virtualization is a possible way of improving system performance and meeting real-time constraints. In this paper we present our virtualization-aware architecture for MIPS processors with support for real-time applications. In our proposed approach no changes are needed in the guest OS, since we implement a full virtualization scheme. Real-time constraints are met by combining the full virtualization technique with hardware support and bare-metal application usage. Details of our virtualization platform are presented and discussed in the paper. Results demonstrate the effectiveness of our approach considering the hardware impact in terms of area, the software performance overhead, and the operating system port to allow its execution in a virtualized environment.
Title: A virtualization approach for MIPS-based MPSoCs. Venue: International Symposium on Quality Electronic Design (ISQED).
Pub Date : 2013-03-04 DOI: 10.1109/ISQED.2013.6523582
S. Priyadarshi, N. Choudhary, Brandon H. Dwiel, Ankita Upreti, E. Rotenberg, W. R. Davis, P. Franzon
Timing the transition of a processor design to a new technology poses a provocative tradeoff. On the one hand, transitioning as early as possible offers a significant competitive advantage by bringing improved designs to market early. On the other hand, an aggressive strategy may prove unprofitable because of the low manufacturing yield of a technology that has not had time to mature. We propose two complementary forms of heterogeneity to profitably exploit an immature technology for Chip Multiprocessors (CMP). First, 3D integration facilitates a technology alloy. The CMP is split across two dies, one fabricated in the old technology and the other in the new technology. The alloy derives benefit from the new technology while limiting cost exposure. Second, to compensate for the lower efficiency of old-technology cores, we exploit application and microarchitectural heterogeneity: applications that gain less from technology scaling are scheduled on old-technology cores; moreover, these cores are retuned to optimize this class of application. For a defect density ratio of 200 between 45nm and 65nm, Hetero² 3D gives 3.6× and 1.5× higher efficiency/cost compared to 2D and 3D homogeneous implementations, respectively, with only 6.5% degradation in efficiency. We also present a sensitivity analysis by sweeping the defect density ratio. The analysis reveals the defect density break-even points, where homogeneous 2D and 3D designs in 45nm achieve the same efficiency/cost as Hetero² 3D, marking significant points in the maturing of the technology.
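The cost side of this tradeoff can be sketched with a textbook yield model. The example below assumes a Poisson yield model, Y = exp(-A·D), and takes cost per good die as proportional to area divided by yield. It ignores wafer price differences between nodes, stacking and TSV yield, and test cost, so it is an illustration of the reasoning and not the paper's cost model. The die area and the absolute defect densities are hypothetical; only the ratio of 200 between the immature and mature technologies follows the abstract.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


def cost_per_good_die(area_cm2: float, defects_per_cm2: float) -> float:
    """Relative silicon cost of one good die: area paid for, divided by yield."""
    return area_cm2 / poisson_yield(area_cm2, defects_per_cm2)


def cost_2d_new(total_area_cm2: float, d_new: float) -> float:
    """Whole CMP on a single die in the new (immature) technology."""
    return cost_per_good_die(total_area_cm2, d_new)


def cost_hetero_3d(total_area_cm2: float, d_old: float, d_new: float) -> float:
    """CMP split across two equal dies: one old-technology, one new-technology."""
    half = total_area_cm2 / 2
    return cost_per_good_die(half, d_old) + cost_per_good_die(half, d_new)


if __name__ == "__main__":
    d_old = 0.001          # hypothetical mature-node defect density (per cm^2)
    d_new = 200 * d_old    # defect density ratio of 200, as in the abstract
    area = 2.0             # hypothetical total CMP area (cm^2)
    print("2D, new technology:", cost_2d_new(area, d_new))
    print("heterogeneous 3D:  ", cost_hetero_3d(area, d_old, d_new))
```

Under these assumptions the split design pays the immature-technology yield penalty on only half of the silicon. Sweeping `d_new` downward toward `d_old` shows how that advantage shrinks as the technology matures, which is the kind of sensitivity analysis the abstract mentions.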
Title: Hetero² 3D integration: A scheme for optimizing efficiency/cost of Chip Multiprocessors. Venue: International Symposium on Quality Electronic Design (ISQED).