Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993609
J. Cong, Karthik Gururaj, Hui Huang, Chunyue Liu, Glenn D. Reinman, Yi Zou
By reconfiguring part of the cache as software-managed scratchpad memory (SPM), hybrid caches can handle both unknown and predictable memory access patterns. However, existing hybrid caches provide a flexible partitioning of cache and SPM without adapting to run-time cache behavior. Previous cache set balancing techniques are either energy-inefficient or require serial tag and data array access. In this paper, an adaptive hybrid cache is proposed that dynamically remaps SPM blocks from high-demand cache sets to low-demand cache sets. This achieves 19%, 25%, 18% and 18% energy-runtime-product reductions over four previous representative techniques on a wide range of benchmarks.
Title: An energy-efficient adaptive hybrid cache. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993666
S. Matsuoka
Supercomputers of the past pursued “performance at all cost”, including power consumption, but nowadays supercomputers require even higher power-performance efficiency than normal computers. For the past 25 years the rate of supercomputer performance increase has constantly exceeded the so-called “Moore's Law”, but this has been partly achieved by increasing the size and thus the power requirement of the machine; such power increase is no longer viable because the machines have gotten too big. Our new project “JST-CREST ULP-HPC” and the new TSUBAME2.0 supercomputer we have built at Tokyo Institute of Technology aim to obtain utmost power efficiency in HPC. TSUBAME2.0 was recognized as the “Greenest Production Supercomputer in the World” in the Green 500 rankings in November 2010.
Title: Making TSUBAME2.0, the world's greenest production supercomputer, even greener — Challenges to the architects. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993670
Qiong Cai, José González, G. Magklis, P. Chaparro, Antonio González
In recent years, multi-core systems have become mainstream in the computer industry. The design of multi-cores takes advantage of thread-level parallelism in emerging applications that are computationally intensive and highly parallel. Energy efficiency is one of the biggest challenges in the design of multi-core systems, and workload imbalance among parallel threads is one of the sources of energy inefficiency. Many techniques based on dynamic voltage and frequency scaling (DVFS) have been proposed to reduce energy consumption on multi-cores, but all of them assume that each core in a multi-core system contains only one hardware context and that only one thread can execute on a core at a time. However, mainstream multi-core systems are moving toward simultaneous multithreading (SMT) support in cores, and existing DVFS-based techniques are not effective at achieving maximum energy savings there. In this paper, we present a novel technique called thread shuffling, which combines thread migration and DVFS to achieve maximum energy savings and maintain performance on a multi-core system supporting SMT. Thread shuffling is implemented and simulated in a cycle-accurate x86 multi-core system. The experiments show that it achieves up to 56% energy savings without performance penalty for selected Recognition, Mining and Synthesis (RMS) applications from Intel Labs.
Title: Thread shuffling: Combining DVFS and thread migration to reduce energy consumptions for multi-core systems. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
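The idea of combining thread migration with DVFS can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it assumes that per-thread remaining work is known, that run time is simply work divided by frequency, and that SMT contention can be ignored. The function name `shuffle_threads` and all inputs are hypothetical.

```python
def shuffle_threads(work, contexts_per_core, freq_levels):
    """Group threads with similar remaining work onto the same SMT core,
    then pick the lowest frequency per core that does not delay the
    critical (heaviest) thread. Assumes all work values are positive.

    work: dict mapping thread name -> remaining work (arbitrary units)
    contexts_per_core: number of SMT hardware contexts per core
    freq_levels: available core frequencies (e.g. in GHz)
    Returns a list of (threads_on_core, chosen_frequency).
    """
    # "Migration" step: sort by remaining work so that slow threads share a core.
    ordered = sorted(work, key=work.get, reverse=True)
    cores = [ordered[i:i + contexts_per_core]
             for i in range(0, len(ordered), contexts_per_core)]

    critical = max(work.values())   # work of the slowest thread overall
    fmax = max(freq_levels)
    plan = []
    for threads in cores:
        heaviest = max(work[t] for t in threads)
        # "DVFS" step: lowest frequency that still finishes with the critical thread.
        needed = fmax * heaviest / critical
        freq = min(f for f in freq_levels if f >= needed)
        plan.append((threads, freq))
    return plan


plan = shuffle_threads({"t0": 100, "t1": 95, "t2": 50, "t3": 45},
                       contexts_per_core=2, freq_levels=[1.0, 1.5, 2.0])
print(plan)  # [(['t0', 't1'], 2.0), (['t2', 't3'], 1.0)]
```

In this toy case the two light threads end up on one core, which can then run at half the frequency without lengthening the overall run time.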
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993639
T. Matsunaga, S. Kimura, Y. Matsunaga
Recent research has indicated that multi-operand addition on FPGAs can be efficiently realized, as in ASICs, by an architecture consisting of a compressor tree, which reduces the number of operands, followed by a carry-propagate adder, provided that generalized parallel counters (GPCs) are used. This paper addresses power- and delay-aware synthesis of GPC-based compressor trees. Based on the observation that dynamic power correlates with the number of GPCs and the number of GPC levels, our approach aims to minimize the maximum number of levels and the total number of GPCs, and an ILP-based algorithm and heuristic approaches are proposed. Several experiments targeting the Altera Stratix III architecture show that the proposed approach reduced the delay by up to 20% with a slight increase in total power dissipation.
Title: Power and delay aware synthesis of multi-operand adders targeting LUT-based FPGAs. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
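For readers unfamiliar with generalized parallel counters, the following minimal sketch shows what a single GPC computes: it takes input bits that may sit in columns of different weights and outputs their weighted sum as a binary number. The function name `gpc` and its list-based interface are assumptions made for illustration; the synthesis algorithms from the paper are not reproduced here.

```python
def gpc(columns):
    """Behavioral model of a generalized parallel counter.

    columns[i] is the list of input bits of weight 2**i.
    Returns the output bits, least significant bit first, using as many
    output bits as the largest possible sum requires.
    """
    total = sum(bit << weight
                for weight, bits in enumerate(columns)
                for bit in bits)
    max_total = sum(len(bits) << weight for weight, bits in enumerate(columns))
    width = max(1, max_total.bit_length())
    return [(total >> i) & 1 for i in range(width)]


# A (3;2) counter is a full adder: three weight-1 bits in, two bits out.
print(gpc([[1, 1, 1]]))                 # [1, 1]
# A (6;3) counter compresses six weight-1 bits into three output bits.
print(gpc([[1, 0, 1, 1, 0, 1]]))        # [0, 0, 1]
# A (1,5;3) counter: five weight-1 bits plus one weight-2 bit.
print(gpc([[1, 1, 1, 1, 1], [1]]))      # [1, 1, 1]
```

A compressor tree chains such counters level by level until only two operands remain for the final carry-propagate adder, which is why both the number of GPCs and the number of levels matter.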
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993632
Kentaro Honda, K. Ikeuchi, M. Nomura, M. Takamiya, T. Sakurai
In order to reduce the minimum operating voltage (VDDmin) of CMOS logic circuits, a new method is proposed that reduces the within-die random threshold voltage (VTH) variation of transistors by a post-fabrication, automatically selective charge injection using substrate hot electrons (SHE), along with novel circuitry to utilize it. In the new circuit, switches are added to combinational logic circuits in order to turn them into latch loops. In order to reduce VDDmin, design guides on the optimal (1) loop topology, (2) number of stages in a loop, (3) VTH shift per charge injection, and (4) number of charge injection trials are explored through simulations. By applying the proposed scheme to a 96-stage inverter chain fabricated in 65-nm CMOS, a measured reduction of VDDmin from 94 mV to 74 mV is successfully demonstrated for the first time.
Title: Reduction of minimum operating voltage (VDDmin) of CMOS logic circuits with post-fabrication automatically selective charge injection. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993620
Yanzhi Wang, Younghyun Kim, Q. Xie, N. Chang, Massoud Pedram
Electrical energy is a high-quality form of energy, and thus it is beneficial to store excess electrical energy in electrical energy storage (EES) rather than converting it into a different type of energy. As with memory devices, no single type of EES element can fulfill all the desirable requirements. Despite active research on new EES technologies, an ultimate high-efficiency, high-power/energy capacity, low-cost, and long-cycle-life EES element is not likely to appear in the near future. We propose a hybrid EES (HEES) system that consists of two or more heterogeneous EES elements, thereby realizing the advantages of each EES element while hiding their weaknesses. The HEES management problems can be broken into charge allocation into different banks of EES elements, charge replacement (i.e., discharge) from different banks of EES elements, and charge migration from one bank of EES elements to another. Even with optimal charge allocation and replacement, charge migration is necessary to improve the EES system efficiency. This paper is the first to formally describe the charge migration efficiency and its optimization. We first define the charge migration architecture and the corresponding charge migration problem. We provide a systematic solution for single-source, single-destination charge migration, considering the efficiency of the charger and power converter, the rate capacity effect of the storage element, the terminal voltage variation of the storage element as a function of the state of charge (SoC), and so on. Experimental results for an HEES system comprising banks of batteries and supercapacitors demonstrate a migration efficiency improvement of up to 51.3% for supercapacitor-to-battery and supercapacitor-to-supercapacitor charge migration.
Title: Charge migration efficiency optimization in hybrid electrical energy storage (HEES) systems. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
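As a rough illustration of what a migration efficiency figure means, the sketch below takes it to be the energy gained by the destination bank divided by the energy drawn from the source bank, for the supercapacitor-to-supercapacitor case. This definition is an assumption made here for illustration; the paper's formal definition, and its models of converter losses and rate capacity effects, may differ. The function names and numeric values are hypothetical.

```python
def capacitor_energy(capacitance_f, voltage_v):
    """Energy stored in an ideal capacitor, E = 1/2 * C * V^2, in joules."""
    return 0.5 * capacitance_f * voltage_v * voltage_v


def migration_efficiency(src_before_j, src_after_j, dst_before_j, dst_after_j):
    """Assumed definition: energy gained by the destination bank
    divided by energy drawn from the source bank."""
    drawn = src_before_j - src_after_j
    gained = dst_after_j - dst_before_j
    return gained / drawn


# Hypothetical example: a 100 F source bank drops from 10 V to 8 V,
# while a 100 F destination bank rises from 0 V to 3 V.
src_before = capacitor_energy(100, 10)   # 5000.0 J
src_after = capacitor_energy(100, 8)     # 3200.0 J
dst_after = capacitor_energy(100, 3)     # 450.0 J
print(migration_efficiency(src_before, src_after, 0.0, dst_after))  # 0.25
```

The gap between the 1800 J drawn and the 450 J stored stands in for the losses (charger, converter, internal resistance) that the optimization in the paper targets.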
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993603
Phillip Stanley-Marbell, V. Cabezas, R. Luijten
This article presents a study of the impact of packaging on the memory and power walls, in the context of application properties. The analysis is supported by characterizations of 130 hardware designs spanning 30 years, along with both microarchitectural simulation and actual-hardware performance counter measurements of 25 applications. It is shown that if trends in supply pin count (growing as the square root of current) and total packaging pin count (doubling every six years) continue, application memory bandwidth requirements, even in the presence of aggressive cache hierarchies, may limit the number of on-chip threads to under a thousand in 2020.
Title: Pinned to the walls — Impact of packaging and application properties on the memory and power walls. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
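The two trends quoted in the abstract can be written as simple scaling rules. The sketch below is only an arithmetic illustration: the starting pin count and years are hypothetical placeholders, not data from the paper, and the function names are invented here.

```python
import math


def projected_pin_count(base_pins, base_year, target_year, doubling_years=6.0):
    """Total package pins under a 'doubles every six years' trend."""
    return base_pins * 2 ** ((target_year - base_year) / doubling_years)


def supply_pin_scale(current_ratio):
    """Relative growth in supply pins when supply current grows by
    current_ratio, under a 'square root of current' trend."""
    return math.sqrt(current_ratio)


# Hypothetical starting point: 1000 pins in 2008, projected to 2020.
print(projected_pin_count(1000, 2008, 2020))  # 4000.0
# Quadrupling the supply current would only double the supply pins.
print(supply_pin_scale(4.0))                  # 2.0
```

Because pin count grows this slowly compared with the bandwidth demand of many on-chip threads, off-chip memory bandwidth becomes the limiting factor the abstract describes.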
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993638
Z. Kedem, V. Mooney, Kirthi Krishna Muntimadugu, K. Palem
Given a 16-bit or 32-bit overclocked ripple-carry adder, we minimize error by allocating multiple supply voltages to the gates. We solve the error minimization problem for a fixed energy budget using a binned geometric program solution (BGPS). A solution found via BGPS outperforms the two best prior approaches, uniform voltage scaling and biased voltage scaling, reducing error by as much as a factor of 2.58X and by a median of 1.58X in 90nm transistor technology.
Title: An approach to energy-error tradeoffs in approximate ripple carry adders. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
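To see why an overclocked ripple-carry adder produces errors at all, a simple behavioral model helps: if the clock period only allows a carry to ripple through a limited number of stages, longer carry chains are cut off. The sketch below implements that truncated-carry model. It is an assumption chosen for illustration and does not model supply voltages, gate delays, or the BGPS optimization from the paper; `truncated_carry_add` and `max_chain` are hypothetical names.

```python
def truncated_carry_add(a, b, width, max_chain):
    """Add two unsigned integers bit by bit, letting each carry ripple
    through at most max_chain lower stages (a crude overclocking model).
    The result is truncated to `width` bits."""
    result = 0
    for i in range(width):
        carry = 0
        # Only the max_chain stages directly below bit i can influence its carry-in.
        for j in range(max(0, i - max_chain), i):
            aj, bj = (a >> j) & 1, (b >> j) & 1
            carry = (aj & bj) | (carry & (aj ^ bj))
        sum_bit = ((a >> i) & 1) ^ ((b >> i) & 1) ^ carry
        result |= sum_bit << i
    return result


# With enough time for the full carry chain the result is exact.
print(truncated_carry_add(7, 1, width=4, max_chain=4))  # 8
# With time for only one stage, the long carry chain of 7 + 1 is lost.
print(truncated_carry_add(7, 1, width=4, max_chain=1))  # 4
```

Raising the supply voltage of selected gates shortens their delay, which in this picture corresponds to letting carries travel further within the same clock period, at the cost of energy.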
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993647
A. Carpenter, Jianyun Hu, Michael C. Huang, Hui Wu, Peng Liu
With increasing core count, chip multiprocessors (CMPs) require a high-performance interconnect fabric that is energy-efficient. Well-engineered transmission line-based communication systems offer an attractive solution, especially for CMPs with a moderate number of cores. While transmission lines have been used for a wide variety of purposes, there is a lack of comprehensive studies to help architects navigate the circuit and physical design space and make proper architecture-level analyses and tradeoffs. This paper makes a first-step effort in exploring part of the design space. Using detailed simulation-based analysis, we show that a shared-medium fabric based on transmission lines can offer better performance and a much better energy profile than a conventional mesh interconnect.
Title: A design space exploration of transmission-line links for on-chip interconnect. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993656
Aldhino Anggorosesar, Young-jin Kim
An LED-based backlight unit (BLU) architecture has enabled local dimming, which can produce higher power savings than global dimming in LCD-based devices. However, existing local dimming techniques have given little consideration to the human visual system. In this paper, we propose a novel local dimming technique using an object-based approach that achieves both good perceived visual quality and high power savings. We utilize the prevalent colors of individual objects in a given image to perform initial dimming, and then enhance the image using a proper fidelity threshold to reduce visible artifacts. Experimental results show that the proposed technique achieves power savings up to 12 and 5.5 times higher than a prior human visual system-aware global dimming approach and a well-designed local dimming approach, respectively.
Title: Object-based local dimming for LCD systems with LED BLUs. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
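The basic mechanism behind local dimming can be sketched as follows: each backlight block is dimmed to the level its brightest pixel needs, and pixel values are scaled up to compensate. This is the simplest max-based baseline, shown only to make the term concrete; it is not the object-based technique proposed in the paper. The function name `dim_block` and the 8-bit grayscale input are assumptions.

```python
def dim_block(pixels, max_level=255):
    """Dim one backlight block to its brightest pixel and compensate.

    pixels: grayscale values (0..max_level) lit by one backlight block
    Returns (backlight, compensated_pixels), where backlight is a
    fraction of full brightness between 0.0 and 1.0.
    """
    peak = max(pixels)
    if peak == 0:
        # A completely black block can switch its backlight off.
        return 0.0, list(pixels)
    backlight = peak / max_level
    # Scale pixel values up so that backlight * pixel stays roughly unchanged.
    compensated = [p * max_level // peak for p in pixels]
    return backlight, compensated


backlight, compensated = dim_block([0, 51, 102])
print(compensated)  # [0, 127, 255]
```

In this example the block's backlight runs at 40% of full brightness (102/255), which is where the power saving comes from; the rounding in the compensated values is one source of the visible artifacts that fidelity thresholds are meant to bound.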