Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523651
Bao Liu, Lu Wang
Delay test pattern generation has emerged as an increasingly critical problem in high performance VLSI designs. Existing techniques find timing critical paths by STA or SSTA, apply a traditional ATPG algorithm subsequently and find the test patterns. In this paper, we propose a new delay test pattern generation method, which finds timing critical paths by more accurate input-aware statistical timing analysis, achieves input patterns by back-tracing, and verifies the estimated timing critical paths under the input patterns by logic simulation. Our experimental results based on 9 ISCAS'89 benchmark circuits show that the state-of-the-art SSTA-TQM-BnB technique achieves an average of 57.83%, 54.50%, and 69.91% delay fault coverage, while our SPSTA-DTPG technique achieves an average of 67.83%, 71.39%, and 77.53% delay fault coverage for a test size of 50, 100, and 200, respectively.
{"title":"Input-aware statistical timing analysis-based delay test pattern generation","authors":"Bao Liu, Lu Wang","doi":"10.1109/ISQED.2013.6523651","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523651","url":null,"abstract":"Delay test pattern generation has emerged as an increasingly critical problem in high performance VLSI designs. Existing techniques find timing critical paths by STA or SSTA, apply a traditional ATPG algorithm subsequently and find the test patterns. In this paper, we propose a new delay test pattern generation method, which finds timing critical paths by more accurate input-aware statistical timing analysis, achieves input patterns by back-tracing, and verifies the estimated timing critical paths under the input patterns by logic simulation. Our experimental results based on 9 ISCAS'89 benchmark circuits show that the state-of-the-art SSTA-TQM-BnB technique achieves an average of 57.83%, 54.50%, and 69.91% delay fault coverage, while our SPSTA-DTPG technique achieves an average of 67.83%, 71.39%, and 77.53% delay fault coverage for a test size of 50, 100, and 200, respectively.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133079686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523661
Young-Ho Gong, H. Jang, S. Chung
Most modern microprocessors have multi-level on-chip caches with multi-megabyte shared last-level cache (LLC). By using multi-level cache hierarchy, the whole size of on-chip caches becomes larger. The increased cache size causes the leakage power and area of the on-chip caches to increase. Recently, to reduce the leakage power and area of the SRAM based cache, the SRAM-eDRAM hybrid cache was proposed. For SRAM-eDRAM hybrid caches, however, there has not been any study to analyze the effects of the reduced area on wire delay, cache access time, and performance. By replacing half (or three fourth) of SRAM cells by small eDRAM cells for the SRAM-eDRAM hybrid caches, wire length is shortened, which eventually results in the reduction of wire delay and cache access time. In this paper, we evaluate the SRAM-eDRAM hybrid caches in terms of the energy, area, wire delay, access time, and performance. We show that the SRAM-eDRAM hybrid cache reduces the energy consumption, area, wire delay, and SRAM array access time by up to 53.9%, 49.9%, 50.4%, and 38.7%, respectively, compared to the SRAM based cache.
{"title":"Performance and cache access time of SRAM-eDRAM hybrid caches considering wire delay","authors":"Young-Ho Gong, H. Jang, S. Chung","doi":"10.1109/ISQED.2013.6523661","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523661","url":null,"abstract":"Most modern microprocessors have multi-level on-chip caches with multi-megabyte shared last-level cache (LLC). By using multi-level cache hierarchy, the whole size of on-chip caches becomes larger. The increased cache size causes the leakage power and area of the on-chip caches to increase. Recently, to reduce the leakage power and area of the SRAM based cache, the SRAM-eDRAM hybrid cache was proposed. For SRAM-eDRAM hybrid caches, however, there has not been any study to analyze the effects of the reduced area on wire delay, cache access time, and performance. By replacing half (or three fourth) of SRAM cells by small eDRAM cells for the SRAM-eDRAM hybrid caches, wire length is shortened, which eventually results in the reduction of wire delay and cache access time. In this paper, we evaluate the SRAM-eDRAM hybrid caches in terms of the energy, area, wire delay, access time, and performance. We show that the SRAM-eDRAM hybrid cache reduces the energy consumption, area, wire delay, and SRAM array access time by up to 53.9%, 49.9%, 50.4%, and 38.7%, respectively, compared to the SRAM based cache.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133890918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523690
Xin Fu, Tao Li, J. Fortes
Packet-switched on-chip interconnection networks are emerging as pervasive communication fabrics to connect different processing elements in multi/many-core chips. As a preferred NoC flow-control mechanism, Express Virtual Channel (EVC) allows packets to virtually bypass intermediate nodes to minimize communication delay. Technology scaling results in process variation and Negative Biased Temperature Instability (NBTI) which can significantly affect the reliability and lifetime of NoC fabricated using nano-meter transistors. In this paper, we propose a technique that significantly improves the reliability of EVC-based NoCs by reducing the simultaneous impact of process variation and NBTI. Our evaluation results using a detailed cycle-accurate simulator on a wide range of synthetic traffics and parallel benchmark traces show up to 75.5% guardband improvement over the conventional EVC-based NoCs.
{"title":"Reliable Express-Virtual-Channel-based network-on-chip under the impact of technology scaling","authors":"Xin Fu, Tao Li, J. Fortes","doi":"10.1109/ISQED.2013.6523690","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523690","url":null,"abstract":"Packet-switched on-chip interconnection networks are emerging as pervasive communication fabrics to connect different processing elements in multi/many-core chips. As a preferred NoC flow-control mechanism, Express Virtual Channel (EVC) allows packets to virtually bypass intermediate nodes to minimize communication delay. Technology scaling results in process variation and Negative Biased Temperature Instability (NBTI) which can significantly affect the reliability and lifetime of NoC fabricated using nano-meter transistors. In this paper, we propose a technique that significantly improves the reliability of EVC-based NoCs by reducing the simultaneous impact of process variation and NBTI. Our evaluation results using a detailed cycle-accurate simulator on a wide range of synthetic traffics and parallel benchmark traces show up to 75.5% guardband improvement over the conventional EVC-based NoCs.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128798730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523601
Jacob Murray, Rajath Hegde, Teng Lu, P. Pande, B. Shirazi
Wireless Network-on-Chip (WiNoC) has emerged as an enabling technology to design low power and high bandwidth massive multi-core chips. The performance advantages mainly stem from using the wireless links as long-range shortcuts between far apart cores. This performance gain can be enhanced further if the characteristics of the wireline links and the processing cores of the WiNoC are optimized according to the traffic patterns and workloads. In this work, we demonstrate that by incorporating both processor- and network-level dynamic voltage and frequency scaling (DVFS) in a WiNoC, the power and thermal profiles can be enhanced without a significant impact on the overall execution time. We also show that depending on the benchmark applications, temperature hotspots can be formed either in the processing core or in the network infrastructure. The proposed dual-level DVFS is capable of addressing both.
{"title":"Sustainable dual-level DVFS-enabled NoC with on-chip wireless links","authors":"Jacob Murray, Rajath Hegde, Teng Lu, P. Pande, B. Shirazi","doi":"10.1109/ISQED.2013.6523601","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523601","url":null,"abstract":"Wireless Network-on-Chip (WiNoC) has emerged as an enabling technology to design low power and high bandwidth massive multi-core chips. The performance advantages mainly stem from using the wireless links as long-range shortcuts between far apart cores. This performance gain can be enhanced further if the characteristics of the wireline links and the processing cores of the WiNoC are optimized according to the traffic patterns and workloads. In this work, we demonstrate that by incorporating both processor- and network-level dynamic voltage and frequency scaling (DVFS) in a WiNoC, the power and thermal profiles can be enhanced without a significant impact on the overall execution time. We also show that depending on the benchmark applications, temperature hotspots can be formed either in the processing core or in the network infrastructure. The proposed dual-level DVFS is capable of addressing both.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"464 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125824717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523627
R. Bazaz, Jianyong Xie, M. Swaminathan
Three dimensional integrated circuits (3D ICs) built with through-silicon vias (TSVs) have smaller planar dimensions, shorter wire length, and better performance than 2D ICs. Many 3D stacks will combine digital and analog/RF circuitry, requiring a strong analog/mixed-signal capability. Because of the unique packaging requirements of stacked die, an IC/package co-design capability is required. Hence there must be some standards to facilitate smooth and effective design of 3D ICs. Thus the main purpose of an exchange format (EF) between dies is to permit sharing of information necessary for design by external parties without disclosing their intellectual property (IP). The requirements of the standards should be the minimum necessary to produce satisfactory answers. Producing such models is a customer support function, not a core engineering function, i.e. it is not generated by the chip design scheme. The role of the standards is to facilitate the transfer of information through a compact model, not necessarily to build one. This paper presents initial efforts in designing such standards or EF. Steady state electrical and thermal simulations are performed in this paper to demonstrate the necessary information that needs to be exchanged between the dies to ensure adequate co-design.
{"title":"Electrical and thermal analysis for design exchange formats in three dimensional integrated circuits","authors":"R. Bazaz, Jianyong Xie, M. Swaminathan","doi":"10.1109/ISQED.2013.6523627","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523627","url":null,"abstract":"Three dimensional integrated circuits (3D ICs) built with through-silicon vias (TSVs) have smaller planar dimensions, shorter wire length, and better performance than 2D ICs. Many 3D stacks will combine digital and analog/RF circuitry, requiring a strong analog/mixed-signal capability. Because of the unique packaging requirements of stacked die, an IC/package co-design capability is required. Hence there must be some standards to facilitate smooth and effective design of 3D ICs. Thus the main purpose of an exchange format (EF) between dies is to permit sharing of information necessary for design by external parties without disclosing their intellectual property (IP). The requirements of the standards should be the minimum necessary to produce satisfactory answers. Producing such models is a customer support function, not a core engineering function, i.e. it is not generated by the chip design scheme. The role of the standards is to facilitate the transfer of information through a compact model, not necessarily to build one. This paper presents initial efforts in designing such standards or EF. Steady state electrical and thermal simulations are performed in this paper to demonstrate the necessary information that needs to be exchanged between the dies to ensure adequate co-design.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126685408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523650
Hong Zhu, V. Kursun
Conventional Static Random Access Memory (SRAM) cells suffer from an intrinsic data instability problem due to directly-accessed data storage nodes during a read operation. Noise margins of memory cells further shrink with increasing variability and decreasing supply voltage in scaled CMOS technologies. A selected set of novel seven-transistor (7T) and conventional six-transistor (6T) multi-threshold-voltage memory circuits are characterized for data stability, write margin, and idle mode leakage currents with an equal area constraint under parameter variations in this paper. The mean of the statistical read static noise margin distribution is enhanced by up to 2.4X and the mean of the statistical array leakage power consumption distribution is reduced by up to 82% with the triple-threshold-voltage 7T SRAM cells as compared to the traditional 6T SRAM cells in a UMC 80nm CMOS technology.
{"title":"Impact of process parameter and supply voltage fluctuations on multi-threshold-voltage seven-transistor static memory cells","authors":"Hong Zhu, V. Kursun","doi":"10.1109/ISQED.2013.6523650","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523650","url":null,"abstract":"Conventional Static Random Access Memory (SRAM) cells suffer from an intrinsic data instability problem due to directly-accessed data storage nodes during a read operation. Noise margins of memory cells further shrink with increasing variability and decreasing supply voltage in scaled CMOS technologies. A selected set of novel seven-transistor (7T) and conventional six-transistor (6T) multi-threshold-voltage memory circuits are characterized for data stability, write margin, and idle mode leakage currents with an equal area constraint under parameter variations in this paper. The mean of the statistical read static noise margin distribution is enhanced by up to 2.4X and the mean of the statistical array leakage power consumption distribution is reduced by up to 82% with the triple-threshold-voltage 7T SRAM cells as compared to the traditional 6T SRAM cells in a UMC 80nm CMOS technology.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121792761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523657
Hsin-Hung Liu, Rung-Bin Lin, I. Tseng
Memory blocks in a structured ASIC are normally pre-customized with fixed sizes and placed at predefined locations. The number of memory blocks is also pre-determined. This imposes a stringent limitation on the use of memory blocks, often creating a situation of either insufficient capacity or considerable waste. To remove this limitation, in this paper we propose a method to create relocatable and resizable SRAM blocks using the same via-configurable logic block to implement both logic gates and 6T SRAM cells. We develop an SRAM compiler to synthesize SRAM blocks of this sort. Our single-port SRAM array uses only 1/3 the area taken by a flip-flop based SRAM array. For dual-port SRAM arrays, this ratio is 2/3. We demonstrate first time the feasibility of deploying a varying number of relocatable and resizable SRAM blocks on a structured ASIC.
{"title":"Relocatable and resizable SRAM synthesis for via configurable structured ASIC","authors":"Hsin-Hung Liu, Rung-Bin Lin, I. Tseng","doi":"10.1109/ISQED.2013.6523657","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523657","url":null,"abstract":"Memory blocks in a structured ASIC are normally pre-customized with fixed sizes and placed at predefined locations. The number of memory blocks is also pre-determined. This imposes a stringent limitation on the use of memory blocks, often creating a situation of either insufficient capacity or considerable waste. To remove this limitation, in this paper we propose a method to create relocatable and resizable SRAM blocks using the same via-configurable logic block to implement both logic gates and 6T SRAM cells. We develop an SRAM compiler to synthesize SRAM blocks of this sort. Our single-port SRAM array uses only 1/3 the area taken by a flip-flop based SRAM array. For dual-port SRAM arrays, this ratio is 2/3. We demonstrate first time the feasibility of deploying a varying number of relocatable and resizable SRAM blocks on a structured ASIC.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123348899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523603
Ku He, A. Gerstlauer, M. Orshansky
In signal processing applications, large energy gains can be obtained by accepting some degradation in the output signal quality. Filters are at the core of many such systems. In this paper, we demonstrate the potential of a new paradigm for achieving favorable quality-energy trade-offs in digital filter design that is based on directly accepting timing errors in the datapath under aggressively scaled VDD. In an unmodified design, such scaling leads to rapid onset of timing errors and, consequently, quality loss. In a modified filter implementation, the onset of large errors is delayed, permitting significant energy reduction while maintaining high quality. Specifically, the innovations in the design include techniques for: 1) run-time adjustment of datapath bitwidth, and 2) design-time reordering of filter taps. We tested the new design strategy on several audio and image processing applications. The designs were synthesized using a 45nm standard cell library. Results of SPICE simulations on the entire designs show that up to 70% energy savings can be achieved while maintaining excellent perceived signal-to-noise ratios (SNRs). Compared to a traditional filter design, the area overhead of our architecture is about 2%.
{"title":"Low-energy digital filter design based on controlled timing error acceptance","authors":"Ku He, A. Gerstlauer, M. Orshansky","doi":"10.1109/ISQED.2013.6523603","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523603","url":null,"abstract":"In signal processing applications, large energy gains can be obtained by accepting some degradation in the output signal quality. Filters are at the core of many such systems. In this paper, we demonstrate the potential of a new paradigm for achieving favorable quality-energy trade-offs in digital filter design that is based on directly accepting timing errors in the datapath under aggressively scaled VDD. In an unmodified design, such scaling leads to rapid onset of timing errors and, consequently, quality loss. In a modified filter implementation, the onset of large errors is delayed, permitting significant energy reduction while maintaining high quality. Specifically, the innovations in the design include techniques for: 1) run-time adjustment of datapath bitwidth, and 2) design-time reordering of filter taps. We tested the new design strategy on several audio and image processing applications. The designs were synthesized using a 45nm standard cell library. Results of SPICE simulations on the entire designs show that up to 70% energy savings can be achieved while maintaining excellent perceived signal-to-noise ratios (SNRs). Compared to a traditional filter design, the area overhead of our architecture is about 2%.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116730179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523669
A. Datta, M. Abu-Rahma, S. Dasnurkar, Hadi Rasouli, Sean Tamjidi, M. Cai, S. Sengupta, P. Chidambaram, Raghavan Thirumala, Nikhil Kulkarni, Prasanna Seeram, Prasad Bhadri, P. Patel, S. Yoon, E. Terzioglu
Mobile devices spend most of the time in standby mode. Supported features and functionalities are increasing in each newer model. With the wide spread adaptation of multi-tasking in mobile devices, retaining current status and data for all active tasks is critical for user satisfaction. Extending battery life in portable mobile devices necessitates the use of minimum possible energy in standby mode while retaining present states for all active tasks. This paper for the first time, explains the low-voltage data-retention failure mechanism in ops. It analyzes the impact of design and process parameters on the data-retention failure. Statistical nature of data retention failure is established and validated with extensive Monte-Carlo simulations across various process corners. Finally, silicon measurement from several 28nm industrial mobile chips is presented showing good correlation of retention failure prediction from simulation.
{"title":"Analysis, modeling and silicon correlation of low-voltage flop data retention in 28nm process technology","authors":"A. Datta, M. Abu-Rahma, S. Dasnurkar, Hadi Rasouli, Sean Tamjidi, M. Cai, S. Sengupta, P. Chidambaram, Raghavan Thirumala, Nikhil Kulkarni, Prasanna Seeram, Prasad Bhadri, P. Patel, S. Yoon, E. Terzioglu","doi":"10.1109/ISQED.2013.6523669","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523669","url":null,"abstract":"Mobile devices spend most of the time in standby mode. Supported features and functionalities are increasing in each newer model. With the wide spread adaptation of multi-tasking in mobile devices, retaining current status and data for all active tasks is critical for user satisfaction. Extending battery life in portable mobile devices necessitates the use of minimum possible energy in standby mode while retaining present states for all active tasks. This paper for the first time, explains the low-voltage data-retention failure mechanism in ops. It analyzes the impact of design and process parameters on the data-retention failure. Statistical nature of data retention failure is established and validated with extensive Monte-Carlo simulations across various process corners. Finally, silicon measurement from several 28nm industrial mobile chips is presented showing good correlation of retention failure prediction from simulation.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"496 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133273352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-03-04DOI: 10.1109/ISQED.2013.6523594
Dali Zhao, H. Homayoun, A. Veidenbaum
A 3D architecture with DRAM memory stacked on a multi-core processor has many benefits for the embedded system. Compared with a conventional 2D design, it reduces memory access latency, increases memory bandwidth and reduces energy consumption. However it poses a thermal challenge as the heat generated by the processor cannot dissipate efficiently through the DRAM memory layer. Due to the fact that DRAM is very sensitive to high temperature as well as temperature variance, 3D stacking causes more failures to occur because DRAM thermal variance is higher than the conventional 2D architecture. To address this thermal challenge we propose to reduce temperature variance and peak temperature of a 3D multi-core processor and stacked DRAM by thermally aware thread migration among processor cores. This method has very limited impact on processor performance. Using migration-based policy we reduce peak steady-state temperature in the processor by up to 8.3 degrees Celsius, with the average of 4.7 degrees.
{"title":"Temperature aware thread migration in 3D architecture with stacked DRAM","authors":"Dali Zhao, H. Homayoun, A. Veidenbaum","doi":"10.1109/ISQED.2013.6523594","DOIUrl":"https://doi.org/10.1109/ISQED.2013.6523594","url":null,"abstract":"A 3D architecture with DRAM memory stacked on a multi-core processor has many benefits for the embedded system. Compared with a conventional 2D design, it reduces memory access latency, increases memory bandwidth and reduces energy consumption. However it poses a thermal challenge as the heat generated by the processor cannot dissipate efficiently through the DRAM memory layer. Due to the fact that DRAM is very sensitive to high temperature as well as temperature variance, 3D stacking causes more failures to occur because DRAM thermal variance is higher than the conventional 2D architecture. To address this thermal challenge we propose to reduce temperature variance and peak temperature of a 3D multi-core processor and stacked DRAM by thermally aware thread migration among processor cores. This method has very limited impact on processor performance. Using migration-based policy we reduce peak steady-state temperature in the processor by up to 8.3 degrees Celsius, with the average of 4.7 degrees.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123796994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}