
IEEE/ACM International Symposium on Low Power Electronics and Design: Latest Papers

An energy-efficient adaptive hybrid cache
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993609
J. Cong, Karthik Gururaj, Hui Huang, Chunyue Liu, Glenn D. Reinman, Yi Zou
By reconfiguring part of the cache as software-managed scratchpad memory (SPM), hybrid caches can handle both unknown and predictable memory access patterns. However, existing hybrid caches provide flexible partitioning of cache and SPM without adapting to run-time cache behavior. Previous cache set balancing techniques are either energy-inefficient or require serial tag and data array access. This paper proposes an adaptive hybrid cache that dynamically remaps SPM blocks from high-demand cache sets to low-demand cache sets. This achieves 19%, 25%, 18% and 18% reductions in energy-runtime product over four previous representative techniques on a wide range of benchmarks.
Citations: 51
Making TSUBAME2.0, the world's greenest production supercomputer, even greener — Challenges to the architects
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993666
S. Matsuoka
Supercomputers of the past pursued “performance at all costs”, power consumption included, but today's supercomputers require even higher power-performance efficiency than ordinary computers. For the past 25 years the rate of supercomputer performance increase has consistently exceeded the so-called “Moore's Law”, but this has been partly achieved by increasing the size, and thus the power requirement, of the machine; such power increases are no longer viable because the machines have gotten too big. Our new project “JST-CREST ULP-HPC” and the new TSUBAME2.0 supercomputer we have built at Tokyo Institute of Technology aim to obtain the utmost power efficiency in HPC. TSUBAME2.0 was recognized as the “Greenest Production Supercomputer in the World” in the Green 500 rankings in November 2010.
Citations: 6
Thread shuffling: Combining DVFS and thread migration to reduce energy consumptions for multi-core systems
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993670
Qiong Cai, José González, G. Magklis, P. Chaparro, Antonio González
In recent years, multi-core systems have become mainstream in the computer industry. The design of multi-cores takes advantage of thread-level parallelism in emerging applications that are computationally intensive and highly parallel. Energy efficiency is one of the biggest challenges in the design of multi-core systems, and workload imbalance among parallel threads is one of the sources of energy inefficiency. Many techniques based on dynamic voltage and frequency scaling (DVFS) have been proposed to save energy on multi-cores, but all of them assume that each core in a multi-core system contains only one hardware context and that only one thread can execute on a core at a time. However, mainstream multi-core systems are moving toward simultaneous multithreading (SMT) support in cores, and existing DVFS-based techniques are not effective at achieving maximum energy savings on such systems. In this paper, we present a novel technique called thread shuffling, which combines thread migration and DVFS to achieve maximum energy savings while maintaining performance on a multi-core system supporting SMT. Thread shuffling is implemented and simulated in a cycle-accurate x86 multi-core system. The experiments show that it achieves up to 56% energy savings without performance penalty for selected Recognition, Mining and Synthesis (RMS) applications from Intel Labs.
Citations: 30
Power and delay aware synthesis of multi-operand adders targeting LUT-based FPGAs
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993639
T. Matsunaga, S. Kimura, Y. Matsunaga
Recent research has indicated that, by utilizing generalized parallel counters (GPCs), multi-operand addition on FPGAs can be efficiently realized, as on ASICs, with an architecture consisting of a compressor tree, which reduces the number of operands, and a carry-propagate adder. This paper addresses power and delay aware synthesis of GPC-based compressor trees. Based on the observation that dynamic power correlates with the number of GPCs and the number of GPC levels, our approach aims to minimize the maximum number of levels and the total number of GPCs; an ILP-based algorithm and heuristic approaches are proposed. Several experiments targeting the Altera Stratix III architecture show that the proposed approach reduces delay by up to 20% at the cost of a slight increase in total power dissipation.
Citations: 17
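The compressor-tree structure described in the abstract above can be illustrated with a minimal sketch. It is not the paper's synthesis algorithm: it uses only the simplest GPC, the (3;2) counter (a full adder applied to every bit column), and a greedy grouping of rows, whereas the paper selects among larger GPCs with an ILP or heuristics. The function names `compress_3_to_2` and `multi_operand_add` are hypothetical, and operands are assumed to be non-negative integers.

```python
from typing import List, Tuple


def compress_3_to_2(a: int, b: int, c: int) -> Tuple[int, int]:
    """Apply a (3;2) counter to every bit column of three rows at once.

    Returns two rows whose sum equals a + b + c, with no carry propagation.
    """
    partial_sum = a ^ b ^ c                        # column-wise sum bit
    carry = ((a & b) | (a & c) | (b & c)) << 1     # column-wise carry, shifted up
    return partial_sum, carry


def multi_operand_add(operands: List[int]) -> Tuple[int, int]:
    """Reduce the operands to at most two rows, then add them once.

    Returns (result, levels), where levels is the depth of the compressor tree.
    """
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        next_rows: List[int] = []
        i = 0
        # Greedily compress complete groups of three rows at this level.
        while i + 3 <= len(rows):
            s, c = compress_3_to_2(rows[i], rows[i + 1], rows[i + 2])
            next_rows.extend([s, c])
            i += 3
        next_rows.extend(rows[i:])   # leftover rows pass through unchanged
        rows = next_rows
        levels += 1
    # Final carry-propagate addition of the remaining rows.
    return sum(rows), levels
```

For example, `multi_operand_add([1, 2, 3, 4])` reduces four rows to two in two levels before the single carry-propagate addition. Both the number of counters and the number of levels grow with the operand count, which are the two quantities the abstract says the authors minimize.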
Reduction of minimum operating voltage (VDDmin) of CMOS logic circuits with post-fabrication automatically selective charge injection
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993632
Kentaro Honda, K. Ikeuchi, M. Nomura, M. Takamiya, T. Sakurai
To reduce the minimum operating voltage (VDDmin) of CMOS logic circuits, a new method is proposed that reduces the within-die random threshold voltage (VTH) variation of transistors by post-fabrication, automatically selective charge injection using substrate hot electrons (SHE), along with novel circuitry to utilize it. In the new circuit, switches are added to combinational logic circuits in order to turn them into latch loops. To reduce VDDmin, design guides on the optimal (1) loop topology, (2) number of stages in a loop, (3) VTH shift per charge injection, and (4) number of charge injection trials are explored through simulations. By applying the proposed scheme to a 96-stage inverter chain fabricated in 65-nm CMOS, a measured reduction of VDDmin from 94mV to 74mV is successfully demonstrated for the first time.
Citations: 1
Charge migration efficiency optimization in hybrid electrical energy storage (HEES) systems
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993620
Yanzhi Wang, Younghyun Kim, Q. Xie, N. Chang, Massoud Pedram
Electrical energy is a high-quality form of energy, and thus it is beneficial to store excess electrical energy in electrical energy storage (EES) rather than converting it into a different type of energy. As with memory devices, no single type of EES element can fulfill all the desirable requirements. Despite active research on new EES technologies, an ultimate EES element with high efficiency, high power and energy capacity, low cost, and long cycle life is unlikely to appear in the near future. We propose an HEES system that consists of two or more heterogeneous EES elements, thereby realizing the advantages of each EES element while hiding their weaknesses. The HEES management problems can be broken into charge allocation into different banks of EES elements, charge replacement (i.e., discharge) from different banks of EES elements, and charge migration from one bank of EES elements to another. Even with optimal charge allocation and replacement, charge migration is mandatory to improve EES system efficiency. This paper is the first to formally describe charge migration efficiency and its optimization. We first define the charge migration architecture and the corresponding charge migration problem. We provide a systematic solution for single-source, single-destination charge migration that considers the efficiency of the charger and power converter, the rate capacity effect of the storage element, the terminal voltage variation of the storage element as a function of the state of charge (SoC), and so on. Experimental results for an HEES system comprising banks of batteries and supercapacitors demonstrate a migration efficiency improvement of up to 51.3% for supercapacitor-to-battery and supercapacitor-to-supercapacitor charge migration.
Citations: 59
Pinned to the walls — Impact of packaging and application properties on the memory and power walls
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993603
Phillip Stanley-Marbell, V. Cabezas, R. Luijten
This article presents a study of the impact of packaging on the memory and power walls, in the context of application properties. The analysis is supported by characterizations of 130 hardware designs spanning 30 years, along with both microarchitectural simulation and actual-hardware performance counter measurements of 25 applications. It is shown that if trends in supply pin count (growing as the square root of current) and total packaging pin count (doubling every six years) continue, application memory bandwidth requirements, even in the presence of aggressive cache hierarchies, may limit the number of on-chip threads to under a thousand in 2020.
Citations: 50
An approach to energy-error tradeoffs in approximate ripple carry adders
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993638
Z. Kedem, V. Mooney, Kirthi Krishna Muntimadugu, K. Palem
Given a 16-bit or 32-bit overclocked ripple-carry adder, we minimize error by allocating multiple supply voltages to the gates. We solve the error minimization problem for a fixed energy budget using a binned geometric program solution (BGPS). A solution found via BGPS outperforms the two best prior approaches, uniform voltage scaling and biased voltage scaling, reducing error by as much as a factor of 2.58X and by a median of 1.58X in 90nm transistor technology.
Citations: 36
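To see why an overclocked ripple-carry adder produces errors at all, the following toy model may help. It is only a sketch under a strong assumption that is not taken from the paper: a carry can travel through at most `max_chain` adder stages within one clock period, and longer chains are simply cut off. The paper's actual contribution, allocating several supply voltages to the gates with a binned geometric program, is not modelled here; `ripple_carry_add` and `max_chain` are hypothetical names.

```python
from typing import Optional


def ripple_carry_add(a: int, b: int, width: int,
                     max_chain: Optional[int] = None) -> int:
    """Bit-level ripple-carry addition modulo 2**width.

    max_chain=None models a correctly clocked adder. Otherwise a carry that
    would have to travel through more than max_chain stages is dropped,
    which is a crude stand-in for a timing violation under overclocking.
    """
    result = 0
    carry = 0
    chain = 0  # number of stages the current carry has already travelled
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        if x & y:
            # This stage generates a fresh carry for the next stage.
            carry, chain = 1, 1
        elif (x ^ y) and carry:
            # This stage propagates the incoming carry one stage further.
            chain += 1
            if max_chain is not None and chain > max_chain:
                carry, chain = 0, 0  # the carry arrives too late and is lost
        else:
            carry, chain = 0, 0
    return result
```

With 4-bit operands, `ripple_carry_add(0b0111, 0b0001, 4)` returns 8, while the same call with `max_chain=1` returns 4, because the carry generated in bit 0 never reaches bit 3. Errors of this kind depend on where the chain is cut, which gives some intuition for why treating the gates non-uniformly, as the abstract describes, can reduce error for a fixed energy budget.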
A design space exploration of transmission-line links for on-chip interconnect
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993647
A. Carpenter, Jianyun Hu, Michael C. Huang, Hui Wu, Peng Liu
With increasing core count, chip multiprocessors (CMPs) require a high-performance interconnect fabric that is energy-efficient. Well-engineered transmission line-based communication systems offer an attractive solution, especially for CMPs with a moderate number of cores. While transmission lines have been used for a wide variety of purposes, there is a lack of comprehensive studies to guide architects in navigating the circuit and physical design space and in making proper architecture-level analyses and tradeoffs. This paper makes a first-step effort in exploring part of the design space. Using detailed simulation-based analysis, we show that a shared-medium fabric based on transmission lines can offer better performance and a much better energy profile than a conventional mesh interconnect.
Citations: 8
Object-based local dimming for LCD systems with LED BLUs
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993656
Aldhino Anggorosesar, Young-jin Kim
An LED-based backlight unit (BLU) architecture has enabled local dimming, which can save more power than global dimming in LCD-based devices. However, existing local dimming techniques have given little consideration to the human visual system. In this paper, we propose a novel local dimming technique that uses an object-based approach to achieve both good perceived visual quality and high power saving. We utilize the prevalent colors of individual objects in a given image to do initial dimming, and then enhance the image using a proper fidelity threshold to reduce visible artifacts. Experimental results show that the proposed technique achieves power savings up to 12 and 5.5 times higher than a prior human visual system-aware global dimming approach and a well-designed local dimming one, respectively.
Citations: 3
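For readers unfamiliar with local dimming, the baseline idea that the abstract above builds on can be sketched as follows. This is plain block-based dimming, not the paper's object-based technique: each backlight block is dimmed to the brightest pixel it must display, and pixel values are scaled up to compensate. The sketch assumes luminance values in [0, 1] and LED power proportional to the backlight level; both assumptions, and the names `local_dimming` and `backlight_power_saving`, are illustrative rather than taken from the paper.

```python
from typing import List, Tuple

Image = List[List[float]]


def local_dimming(image: Image, block: int) -> Tuple[Image, Image]:
    """Return (backlight level per block, compensated pixel values).

    Each block's backlight is set to the maximum luminance inside the block,
    and every pixel is divided by that level so that backlight * pixel
    reproduces the original luminance.
    """
    rows, cols = len(image), len(image[0])
    backlight: Image = []
    compensated: Image = [[0.0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        level_row: List[float] = []
        for c0 in range(0, cols, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, rows))
                     for c in range(c0, min(c0 + block, cols))]
            level = max(image[r][c] for r, c in cells)
            level_row.append(level)
            for r, c in cells:
                # A fully dark block keeps its backlight off and its pixels at 0.
                compensated[r][c] = image[r][c] / level if level > 0 else 0.0
        backlight.append(level_row)
    return backlight, compensated


def backlight_power_saving(backlight: Image) -> float:
    """Fraction of backlight power saved versus running every block at 1.0."""
    levels = [level for row in backlight for level in row]
    return 1.0 - sum(levels) / len(levels)
```

For a 2x4 image `[[0.5, 0.25, 1.0, 1.0], [0.5, 0.5, 1.0, 0.5]]` with `block=2`, the left block is dimmed to 0.5 and the right block stays at 1.0, a 25% saving under the linear power assumption. The object-based method in the abstract replaces the per-block maximum with dimming derived from the prevalent colors of objects and then corrects the image against a fidelity threshold.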