Embedded computing systems include several off-chip serial links that are typically used to interface processing elements with peripherals such as sensors, actuators, and I/O controllers. Because these connections run over long physical lines, they can contribute significantly to total energy consumption. On the other hand, many embedded applications are error resilient, i.e., they can tolerate intermediate approximations without a significant impact on the final quality of results. This feature can be exploited in serial buses to explore the trade-off between data approximation and energy consumption. We propose a simple yet very effective approximate encoding for reducing dynamic energy in serial buses. Our approach uses differential encoding as a baseline scheme and extends it with bounded approximations to overcome the intrinsic limitations of differential encoding for data with low temporal correlation. We show that the encoder and decoder for this algorithm can be implemented in hardware with no throughput loss and marginal power overheads. Nonetheless, our approach is superior to state-of-the-art approximate encodings, and for realistic inputs it reaches up to 95% power reduction with <1% average error on decoded data.
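The idea of differential encoding with a bounded approximation can be illustrated with a minimal sketch. This is not the authors' encoder: the threshold rule, the 8-bit word width, and the names encode, decode, and max_err are assumptions chosen only for illustration. Each word is sent as the XOR with the previously reconstructed word; when the new word lies within max_err of that reconstruction, an all-zero code is sent instead, so the decoder repeats its last value and the error stays bounded by max_err.

```python
def to_bits(word, width=8):
    """Serialize one word MSB-first into a list of 0/1 values."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


def serial_transitions(words, width=8):
    """Count level changes on one serial line carrying the words back to back."""
    stream = [bit for w in words for bit in to_bits(w, width)]
    return sum(a != b for a, b in zip(stream, stream[1:]))


def encode(words, max_err=0):
    """Differential (XOR) encoding with a hypothetical bounded approximation."""
    prev = 0  # last value as the decoder will reconstruct it
    codes = []
    for w in words:
        if abs(w - prev) <= max_err:
            codes.append(0)         # close enough: decoder repeats prev
        else:
            codes.append(w ^ prev)  # exact difference against the reconstruction
            prev = w
    return codes


def decode(codes):
    """Invert the encoding by accumulating XOR differences."""
    prev = 0
    out = []
    for c in codes:
        prev ^= c
        out.append(prev)
    return out


samples = [100, 101, 103, 102, 110, 111]
exact = encode(samples)              # lossless differential encoding
approx = encode(samples, max_err=2)  # bounded-error variant
print(serial_transitions(samples), serial_transitions(exact), serial_transitions(approx))
```

Because the encoder tracks the decoder's reconstruction rather than the true previous sample, approximation errors do not accumulate from word to word.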
{"title":"Approximate differential encoding for energy-efficient serial communication","authors":"D. J. Pagliari, E. Macii, M. Poncino","doi":"10.1145/2902961.2902974","DOIUrl":"https://doi.org/10.1145/2902961.2902974","url":null,"abstract":"Embedded computing systems include several off-chip serial links, that are typically used to interface processing elements with peripherals, such as sensors, actuators and I/O controllers. Because of the long physical lines of these connections, they can contribute significantly to the total energy consumption. On the other hand, many embedded applications are error resilient, i.e. they can tolerate intermediate approximations without a significant impact on the final quality of results. This feature can be exploited in serial buses to explore the trade-off between data approximations and energy consumption. We propose a simple yet very effective approximate encoding for reducing dynamic energy in serial buses. Our approach uses differential encoding as a baseline scheme, and extends it with bounded approximations to overcome the intrinsic limitations of differential encoding for data with low temporal correlation. We show that encoder and decoder for this algorithm can be implemented in hardware with no throughput loss and truly marginal power overheads. Nonetheless, our approach is superior to state-of-the-art approximate encodings, and for realistic inputs it reaches up to 95% power reduction with <1% average error on decoded data.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115162798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The physical dimensions of standard cells constrain the dimensions of power networks, affecting the on-chip power noise. An exploratory modeling methodology is presented for estimating power noise in advanced technology nodes. The models are evaluated for 14, 10, and 7 nm technologies to assess the impact on performance. Scaled technologies are shown to be more sensitive to power noise, resulting in a potential loss of the performance enhancements achieved by scaling. Stripes between local track rails are evaluated as a means to reduce power noise, exhibiting up to 56.5% improvement in power noise for the 7 nm technology node. A strong dependence on the width of a stripe is observed, indicating that fewer wide stripes are more favorable than many thin stripes. As a promising alternative material for power network interconnects, graphene is shown to exhibit good potential for reducing power noise. The effects of different scaling scenarios of local power rails on power noise are also discussed.
{"title":"Exploratory power noise models of standard cell 14, 10, and 7 nm FinFET ICs","authors":"Ravi Patel, Kan Xu, E. Friedman, P. Raghavan","doi":"10.1145/2902961.2903035","DOIUrl":"https://doi.org/10.1145/2902961.2903035","url":null,"abstract":"The physical dimensions of standard cells constrain the dimensions of power networks, affecting the on-chip power noise. An exploratory modeling methodology is presented for estimating power noise in advanced technology nodes. The models are evaluated for 14, 10, and 7 nm technologies to assess the impact on performance. Scaled technologies are shown to be more sensitive to power noise, resulting in potential loss of performance enhancements achieved by scaling. Stripes between local track rails is evaluated as a means to reduce power noise, exhibiting up to 56.5% improvement in power noise for the 7 nm technology node. A strong dependence on the width of a stripe is observed, indicating that fewer wide stripes are more favorable then many thin stripes. As a promising alternative material for power network interconnects, graphene is shown to exhibit good potential in reducing power noise. The effects of different scaling scenarios of local power rails on power noise are also discussed.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126350450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aditya Dalakoti, Carrie Segal, Merritt Miller, F. Brewer
We present a metric for event detection, targeted at the analysis of CMOS asynchronous serial data links. Our metric is used to analyze signaling strategies that allow for coincident or nearly coincident detection of both data and event timing. The metric predicts that the CMOS link signaling mechanism has substantial implicit dispersion and intersymbol interference (ISI) tolerance when compared to conventionally timed links. In fact, it predicts correct link operation in situations where eye-diagram techniques predict link failure. Practical operation margins and metrics are described and evaluated for PCB and cabling solutions, suggesting that 10+ Gb/s low-power asynchronous links could be implemented in 130 nm CMOS technology.
{"title":"Asynchronous high speed serial links analysis using integrated charge for event detection","authors":"Aditya Dalakoti, Carrie Segal, Merritt Miller, F. Brewer","doi":"10.1145/2902961.2902998","DOIUrl":"https://doi.org/10.1145/2902961.2902998","url":null,"abstract":"We present a metric for event detection, targeted for the analysis of CMOS asynchronous serial data links. Our metric is used to analyze signaling strategies that allow for coincident or nearly coincident detection of both data and event timing. The metric predicts that the CMOS link signaling mechanism has substantial implicit dispersion and intersymbol interference [ISI] tolerance when compared to conventionally timed links. In fact, it predicts correct link operation in situations where eye-diagram techniques predict link failure. Practical operation margins and metrics are described and evaluated for PCB and cabling solutions suggesting 10+ Gb/s low-power asynchronous links could be implemented in CMOS 130nm technology.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128467284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dimitrios Stamoulis, S. Corbetta, D. Rodopoulos, P. Weckx, P. Debacker, B. Meyer, B. Kaczer, P. Raghavan, D. Soudris, F. Catthoor, Z. Zilic
Atomistic-based approaches accurately model Bias Temperature Instability (BTI) phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. In this way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show that existing works that assume constant stress patterns fail to account for workload dependency, leading to fundamental estimation errors. Second, we evaluate the impact of different real workloads on selected CPU sub-blocks from a commercial processor design. To the best of our knowledge, this is the first work that combines atomistic properties and true workload dependency for variability analysis.
{"title":"Capturing true workload dependency of BTI-induced degradation in CPU components","authors":"Dimitrios Stamoulis, S. Corbetta, D. Rodopoulos, P. Weckx, P. Debacker, B. Meyer, B. Kaczer, P. Raghavan, D. Soudris, F. Catthoor, Z. Zilic","doi":"10.1145/2902961.2902992","DOIUrl":"https://doi.org/10.1145/2902961.2902992","url":null,"abstract":"Atomistic-based approaches accurately model Bias Temperature Instability phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. That way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show that existing works that assume constant stress patterns fail to account for workload dependency leading to fundamental estimation errors. Second, we evaluate the impact of different real workloads on selected CPU sub-blocks from a commercial processor design. To the best of our knowledge, this is the first work that combines atomistic property and true workload-dependency for variability analysis.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116560646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Authentication of electronic devices has become critical. Hardware authentication is one way to enhance the security of a chip. Along with software, it makes it harder for an intruder to access a computer, smartphone, or other device without authorization. One way of authenticating a device through hardware is to use fabrication anomalies, which are random and unclonable. This mechanism is called a Physical Unclonable Function (PUF). PUFs are easy to evaluate but hard to predict. The PUF concept has gained popularity over the past decade, as researchers started taking advantage of the randomness of electrical signals to build a unique authentication block. This survey presents the state-of-the-art devices that are currently being investigated as PUFs. The different technologies are compared by taking into account reproducibility, uniqueness, randomness, area, scalability, and compatibility with CMOS. Emphasis is put on technologies that are emerging and gaining commercial interest. Through comparisons, we show their applicability to different environments.
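Two of the comparison criteria named above, uniqueness and reproducibility, are commonly quantified with Hamming distances between PUF responses. The sketch below follows that common convention; the abstract does not give the survey's own definitions, and the response values and function names here are made up purely for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing bits between two integer-encoded responses."""
    return bin(a ^ b).count("1")


def uniqueness(responses, n_bits):
    """Mean pairwise inter-chip Hamming distance as a fraction of n_bits (ideal: 0.5)."""
    pairs = list(combinations(responses, 2))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * n_bits)


def reproducibility(reference, rereads, n_bits):
    """One minus the mean intra-chip Hamming distance fraction (ideal: 1.0)."""
    return 1 - sum(hamming(reference, r) for r in rereads) / (len(rereads) * n_bits)


# Hypothetical 8-bit responses from three chips to the same challenge.
chips = [0b10110010, 0b01101100, 0b11010101]
u = uniqueness(chips, 8)

# Hypothetical repeated reads of the first chip; one read has a single bit flip.
r = reproducibility(0b10110010, [0b10110010, 0b10110011], 8)
print(u, r)
```

A uniqueness near 0.5 means responses from different chips look uncorrelated, while a reproducibility near 1.0 means the same chip returns the same response across reads.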
{"title":"Survey of emerging technology based physical unclonable funtions","authors":"Ilia A. Bautista Adames, J. Das, S. Bhanja","doi":"10.1145/2902961.2903044","DOIUrl":"https://doi.org/10.1145/2902961.2903044","url":null,"abstract":"Authentication of electronic devices has become critical. Hardware authentication is one way to enhance security of a chip. Along with software, it makes it harder for an intruder to access any computer, smart-phone, or other devices without authorization. One way of authenticating a device through hardware is to use the fabrication anomalies, which are random and unclonable. This mechanism is called a Physical Unclonable Function (PUF). PUFs are easy to evaluate but hard to predict. PUF is a concept that gained popularity since the past decade, when researchers started taking advantage of the randomness of electrical signals in order to build a unique authentication block. This survey will show the state-of-the-art devices that are currently investigated as PUFs. The different technologies are compared by taking into account reproducibility, uniqueness, randomness, area, scalability, and compatibility with CMOS. Emphasis is put on technologies that are emerging and gaining commercial interest. Through comparisons, we will show their applicability to different environments.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124345174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Circuit obfuscation techniques have been proposed to conceal a circuit's functionality in order to thwart reverse engineering (RE) attacks on integrated circuits (ICs). We believe that a good obfuscation method should have low design complexity and low performance overhead, yet impose high RE attack complexity. However, existing obfuscation techniques do not meet all these requirements. In this paper, we propose a polynomial obfuscation scheme which leverages specially designed multiplexers (MUXs) to replace judiciously selected logic gates. Candidate gates for obfuscation are selected using a novel gate classification method which utilizes IC topological structure information. We show that this scheme is resilient to all known attacks, and hence secure. Experiments are conducted on the ISCAS 85/89 and MCNC benchmark suites to evaluate the performance overhead due to obfuscation.
{"title":"Secure and low-overhead circuit obfuscation technique with multiplexers","authors":"Xueyan Wang, Xiaotao Jia, Qiang Zhou, Yici Cai, Jianlei Yang, Mingze Gao, G. Qu","doi":"10.1145/2902961.2903000","DOIUrl":"https://doi.org/10.1145/2902961.2903000","url":null,"abstract":"Circuit obfuscation techniques have been proposed to conceal circuit's functionality in order to thwart reverse engineering (RE) attacks to integrated circuits (IC). We believe that a good obfuscation method should have low design complexity and low performance overhead, yet, causing high RE attack complexity. However, existing obfuscation techniques do not meet all these requirements. In this paper, we propose a polynomial obfuscation scheme which leverages special designed multiplexers (MUXs) to replace judiciously selected logic gates. Candidate to-be-obfuscated logic gates are selected based on a novel gate classification method which utilizes IC topological structure information. We show that this scheme is resilient to all the known attacks, hence it is secure. Experiments are conducted on ISCAS 85/89 and MCNC benchmark suites to evaluate the performance overhead due to obfuscation.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131762941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A technique for sampling clock skew correction by adjusting the delay in the input signal to each channel in a time-interleaved (TI) ADC is proposed. A proof-of-concept TI ADC employing this technique was implemented in a 65 nm CMOS process. The four-way TI ADC operates at an effective sampling rate of 150 MS/s and achieves 60.2 dB and 58.2 dB SNDR for input signal frequencies of 2.1 MHz and 74.1 MHz, respectively. The ADC consumes 12.4 mW from a 1.2 V supply and occupies an area of 0.9 mm².
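The reported figures can be related to each other with standard ADC arithmetic. The sketch below derives the per-channel sampling rate, the effective number of bits (ENOB), and the Walden figure of merit from the numbers in the abstract. The derived values are not stated in the source; they follow only from the textbook formulas ENOB = (SNDR - 1.76) / 6.02 and FOM = P / (2^ENOB * fs).

```python
# Figures quoted in the abstract.
FS_TOTAL_HZ = 150e6   # effective sampling rate of the interleaved ADC
CHANNELS = 4          # four-way time interleaving
POWER_W = 12.4e-3     # total power consumption
SNDR_DB = {2.1e6: 60.2, 74.1e6: 58.2}  # input frequency (Hz) -> SNDR (dB)


def enob(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, fs_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob(sndr_db) * fs_hz)


# Each channel only has to sample at the aggregate rate divided by the channel count.
fs_channel_hz = FS_TOTAL_HZ / CHANNELS

for f_in, sndr in SNDR_DB.items():
    fom_fj = walden_fom(POWER_W, sndr, FS_TOTAL_HZ) * 1e15
    print(f"f_in = {f_in / 1e6:.1f} MHz: ENOB = {enob(sndr):.2f} bits, FOM = {fom_fj:.0f} fJ/step")
```

The 2 dB SNDR drop between the low and near-Nyquist input frequencies corresponds to roughly a third of a bit of ENOB.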
{"title":"A sampling clock skew correction technique for time-interleaved SAR ADCs","authors":"D. Prashanth, Hae-Seung Lee","doi":"10.1145/2902961.2903008","DOIUrl":"https://doi.org/10.1145/2902961.2903008","url":null,"abstract":"A technique for sampling clock skew correction by adjusting the delay in the input signal to each channel in a time-interleaved (TI) ADC is proposed. A proof-of-concept TI ADC employing this technique was implemented in a 65 nm CMOS process. The four-way TI ADC operates at an effective sampling rate of 150 MS/s, and achieves 60.2 dB and 58.2 dB SNDR for an input signal frequency of 2.1 MHz and 74.1 MHz, respectively. The ADC consumes 12.4 mW from a 1.2 V supply and occupies an area of 0.9 mm2.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"33 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131992150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To fully exploit the massive parallelism of many cores, this work tackles the problem of mapping large-scale applications onto heterogeneous networks-on-chip (NoCs) to minimize the peak workload and thereby avoid energy hotspots. A task-resource co-optimization framework is proposed which configures the on-chip communication infrastructure and maps the applications simultaneously and coherently, aiming to minimize the peak load under constraints on computation power, communication capacity, and a total cost budget for on-chip resources. The problem is first formulated as a linear programming model to search for an optimal solution. A heuristic algorithm is further developed for fast design space exploration in extremely large-scale many-core NoCs. Extensive simulations are carried out with real-world benchmarks and randomly generated task graphs to demonstrate the effectiveness and efficiency of the proposed schemes.
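The peak-load objective can be made concrete with a small greedy sketch. This is not the linear programming model or the heuristic proposed in the paper: it is a generic heaviest-task-first placement that ignores communication capacity and cost budgets, and all task names, workloads, and core speeds are hypothetical.

```python
def map_tasks(task_loads, core_speeds):
    """Greedily place the heaviest tasks first on the core whose normalized load stays lowest.

    task_loads:  dict of task name -> work units (hypothetical values)
    core_speeds: list of relative core speeds for a heterogeneous platform
    Returns (mapping of task -> core index, resulting peak normalized load).
    """
    load = [0.0] * len(core_speeds)
    mapping = {}
    for task, work in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        # Pick the core that would have the smallest load/speed ratio after placement.
        best = min(range(len(core_speeds)),
                   key=lambda c: (load[c] + work) / core_speeds[c])
        load[best] += work
        mapping[task] = best
    peak = max(l / s for l, s in zip(load, core_speeds))
    return mapping, peak


tasks = {"a": 8, "b": 6, "c": 4, "d": 2}
speeds = [2.0, 1.0]  # core 0 is twice as fast as core 1
mapping, peak = map_tasks(tasks, speeds)
print(mapping, peak)
```

A full formulation would add the communication and budget constraints mentioned above, which is where an optimization model or a more elaborate heuristic becomes necessary.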
{"title":"Task-resource co-allocation for hotspot minimization in heterogeneous many-core NoCs","authors":"Md Farhadur Reza, Dan Zhao, Hongyi Wu","doi":"10.1145/2902961.2903003","DOIUrl":"https://doi.org/10.1145/2902961.2903003","url":null,"abstract":"To fully exploit the massive parallelism of many cores, this work tackles the problem of mapping large-scale applications onto heterogeneous on-chip networks (NoCs) to minimize the peak workload for energy hotspot avoidance. A task-resource co-optimization framework is proposed which configures the on-chip communication infrastructure and maps the applications simultaneously and coherently, aiming to minimize the peak load under the constraints of computation power and communication capacity and a total cost budget of on-chip resources. The problem is first formulated into a linear programming model to search for optimal solution. A heuristic algorithm is further developed for fast design space exploration in extremely large-scale many-core NoCs. Extensive simulations are carried out under real-world benchmarks and randomly generated task graphs to demonstrate the effectiveness and efficiency of the proposed schemes.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114271766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents the design of a non-volatile register file using cells made of an SRAM and a Programmable Metallization Cell (PMC). The proposed cell is a symmetric 8T2P (8 transistors, 2 PMCs) design; it utilizes three control lines to ensure the correctness of its operations (i.e., Write, Read, Store, and Restore). Simulation results using HSPICE are provided for the cell as well as the register file array (both one- and two-dimensional schemes). At the cell level, it is shown that the off-state resistance has a limited effect on the Read time, because in the proposed circuit the transistor connecting the PMCs to the SRAM is off. While it has no significant effect on the Store time, the off-state resistance does affect the Restore operation: an increase in off-state PMC resistance causes an increase in Restore time. A comparison between non-volatile register files utilizing either PMCs or Phase Change Memories (PCMs) is provided. The register file using PMCs has faster Store and Read times than its PCM-based counterpart; this is mostly caused by the difference in resistance values between these two non-volatile technologies. The lower delay involved in these operations confirms that the proposed PMC-based register file offers significant advantages in terms of delay performance.
{"title":"A design of a non-volatile PMC-based (programmable metallization cell) register file","authors":"Salin Junsangsri, Jie Han, F. Lombardi","doi":"10.1145/2902961.2903034","DOIUrl":"https://doi.org/10.1145/2902961.2903034","url":null,"abstract":"This paper presents the design of a non-volatile register file using cells made of a SRAM and a Programmable Metallization Cell (PMC). The proposed cell is a symmetric 8T2P (8-transistors, 2PMC) design; it utilizes three control lines to ensure the correctness in its operations (i.e. Write, Read, Store and Restore). Simulation results using HSPICE are provided for the cell as well as the register file array (both one- and two-dimensional schemes). At cell level, it is shown that the off-state resistance has a limited effect on the Read time, because in the proposed circuit the transistor connecting the PMCs to the SRAM is off. While having no significant effect on the Store time, the time of the Restore operation depends on the value of the off-state resistance, i.e. an increase in off-state PMC resistance causes an increase in Restore time. Comparison between non-volatile register files utilizing either PMCs, or Phase Change Memories (PCMs) is provided. The register file using PMCs has a faster Store and Read times than the PCM-based counterpart; this is mostly caused by the difference in resistance values for these two non-volatile technologies. The lower delay involved in these operations confirms that the proposed PMC-based register file offers significant advantages in terms of delay performance.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114686259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Negative bias temperature instability (NBTI) has emerged as a critical challenge to the lifetime reliability of computing systems. Traditionally, temperature-aware methodologies are used to mitigate the impact of NBTI on the aging and degradation of computing systems. However, in the presence of process variation, which is the norm in manycore processors, temperature-aware techniques are inefficient at improving lifetime reliability and can result in poor performance. In this paper, we propose a novel performance constraint-aware task mapping technique that improves lifetime reliability by mitigating NBTI while considering on-chip process variation. Our approach consists of two phases, namely design time and run time. At design time, we generate Pareto-optimal mappings. At run time, our technique judiciously intervenes to perform workload migration to save the weakest processing core. We compare our approach with performance-greedy and thermal-aware task mapping techniques. The experimental results demonstrate that our approach outperforms the other two techniques and improves the lifetime reliability of a manycore system by as much as 54% without violating the throughput constraint.
{"title":"Performance constraint-aware task mapping to optimize lifetime reliability of manycore systems","authors":"Vijeta Rathore, Vivek Chaturvedi, T. Srikanthan","doi":"10.1145/2902961.2902996","DOIUrl":"https://doi.org/10.1145/2902961.2902996","url":null,"abstract":"Negative bias temperature instability (NBTI) has emerged as a critical challenge to lifetime reliability of computing systems. Traditionally, temperature-aware methodologies are used to mitigate the impact of NBTI on aging and degradation of computing systems. However, in the presence of process variation, which is the norm in manycore processors, temperature-aware techniques are inefficient in improving lifetime reliability and can result in poor performance. In this paper, we propose a novel performance constraint-aware task mapping technique to improve lifetime reliability by mitigating NBTI considering on-chip process variation. Our approach consists of two phases, namely design-time and run-time. During design time, we generate Pareto-optimal mappings. Following which, our run-time technique judiciously intervenes to perform workload migration to save the weakest processing core. We compare our approach with performance-greedy and thermal-aware task mapping techniques. The experiment results demonstrate that our approach outperforms other two techniques and improves lifetime reliability of a manycore system as much as 54% without violating the throughput constraint.","PeriodicalId":407054,"journal":{"name":"2016 International Great Lakes Symposium on VLSI (GLSVLSI)","volume":"210 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121359523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}