2018 Ninth International Green and Sustainable Computing Conference (IGSC): Latest Papers

Mitigating the Energy Impacts of VBTI Aging in Photonic Networks-on-Chip Architectures with Multilevel Signaling
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752130
Ishan G. Thakkar, S. Pasricha
Photonic networks-on-chip (PNoCs) can enable higher-bandwidth, lower-latency data transfers at the speed of light. Such PNoCs consist of photonic waveguides that carry dense-wavelength-division-multiplexed (DWDM) signals and microring resonators (MRs) that modulate and receive them. To modulate and receive DWDM photonic signals, MRs rely on voltage biasing to change their free-carrier concentration or operating temperature. However, long-term operation of MRs under constant or time-varying temperature and voltage biasing causes aging. This voltage bias and temperature induced (VBTI) aging leads to resonance wavelength drifts and Q-factor degradation at the device level, which in turn exacerbate three key spectral effects at the photonic link level: intermodulation crosstalk, heterodyne crosstalk, and signal sidelobe truncation. These adverse spectral effects ultimately increase signal power attenuation and energy-per-bit in PNoCs. Our frequency-domain analysis of photonic links shows that using four-level pulse amplitude modulation (4-PAM) signaling instead of traditional on-off keying (OOK) signaling can proactively reduce the signal attenuation caused by these VBTI aging induced spectral effects. Our system-level evaluation indicates that, compared to OOK-based PNoCs with no aging, 4-PAM-based PNoCs can achieve 5.5% better energy efficiency even after undergoing VBTI aging for 3 years.
Citations: 3
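The difference between the two signaling schemes named in the abstract can be sketched as follows: OOK carries one bit per symbol using two amplitude levels, while 4-PAM carries two bits per symbol using four levels, so the same payload needs half as many symbols. The normalized amplitude values, the Gray-coded bit mapping, and the function names are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: the normalized levels and the Gray-coded mapping
# below are assumptions, not values from the paper.
OOK_LEVELS = {0: 0.0, 1: 1.0}
PAM4_LEVELS = {(0, 0): 0.0, (0, 1): 1 / 3, (1, 1): 2 / 3, (1, 0): 1.0}


def ook_modulate(bits):
    """Map each bit to one of two amplitude levels (one bit per symbol)."""
    return [OOK_LEVELS[b] for b in bits]


def pam4_modulate(bits):
    """Map each pair of bits to one of four amplitude levels (two bits per symbol)."""
    if len(bits) % 2 != 0:
        raise ValueError("4-PAM needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


bits = [1, 0, 0, 1, 1, 1, 0, 0]
ook_symbols = ook_modulate(bits)
pam4_symbols = pam4_modulate(bits)
print(len(ook_symbols), len(pam4_symbols))  # symbol counts for the same payload
```

At a fixed bit rate, halving the symbol count halves the symbol rate and so narrows the signal spectrum. How that interacts with aging-induced resonance drift is the subject of the paper's frequency-domain analysis and is not modeled in this sketch.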
Evaluating Radial Basis Function Kernel on OpenCL FPGA Platform
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752172
Zheming Jin, H. Finkel
Field-programmable gate arrays (FPGAs) are becoming a promising heterogeneous computing component for scientific computing as floating-point optimized architectures are added to current FPGAs. Emerging high-level synthesis (HLS) tools provide a streamlined design flow that makes FPGAs accessible to researchers with little FPGA development experience. In this paper, we use the Radial Basis Function (RBF) kernel of a support vector machine as a case study to evaluate the potential of implementing machine learning kernels on FPGAs, and the ability of an HLS tool to convert a kernel written in a high-level language into an FPGA implementation. We explain the HLS flow and the RBF kernel, evaluate the kernel in an OpenCL-to-FPGA HLS flow, and describe the optimizations applied to it. Our optimizations using kernel vectorization and loop unrolling improve kernel performance by a factor of 15.8 over a baseline kernel on the Nallatech 385A FPGA card, which features an Intel Arria 10 GX 1150 FPGA. In terms of energy efficiency, the performance per watt of the FPGA platform is 2.8X higher than that of an Intel Xeon 16-core CPU and 1.7X higher than that of an Nvidia Tesla K80 GPU. On the other hand, the performance per watt of an Intel Xeon Phi Knights Landing CPU and of an Nvidia Tesla P100 GPU is 5.3X and 1.7X higher than that of the FPGA, respectively.
Citations: 1
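For readers unfamiliar with the kernel studied in the abstract, the standard RBF kernel is K(x, y) = exp(-gamma * ||x - y||^2). The sketch below is a plain Python reference version, not the paper's OpenCL kernel; the function names, the sample points, and the choice of `gamma` are assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, gamma):
    """Return exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def rbf_kernel_matrix(points, gamma):
    """Evaluate the kernel for every pair of input vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical sample data and gamma, chosen only to exercise the functions.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
matrix = rbf_kernel_matrix(points, gamma=0.5)
```

The inner sum over vector elements is the kind of regular, independent loop that vectorization and loop unrolling target, which is presumably why those two optimizations are the ones the abstract highlights.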
A Dynamic Programming Technique for Energy-Efficient Multicore Systems
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752159
Shervin Hajiamini, B. Shirazi, Aaron S. Crandall, Hassan Ghasemzadeh
Focusing on static (compile-time) methods for voltage/frequency (V/F) level assignment, we propose an efficient dynamic programming (DP) technique based on the Viterbi algorithm, which uses the Energy-Delay Product (EDP) as the objective function to predict the best V/F levels. Using profiled information about applications, this technique minimizes energy consumption and execution time. We compare the proposed algorithm against three heuristic methods: a greedy version of our algorithm, a feedback controller method, and a simple heuristic that uses historical performance to predict V/F level adjustments. Experimental results show that our algorithm outperforms the heuristics under study by an average of 12% to 24% on the EDP criterion.
Citations: 0
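A Viterbi-style search over V/F levels, as described in the abstract, can be sketched as below. The abstract does not give the cost model, so this sketch assumes the objective is the sum of a per-phase EDP value plus a penalty for switching levels, which keeps the cost additive so that the Viterbi recurrence applies. The function name, the `phase_costs` and `switch_cost` tables, and all numbers are hypothetical.

```python
def viterbi_vf_schedule(phase_costs, switch_cost):
    """Pick one V/F level per program phase, minimizing the summed cost.

    phase_costs[t][k]: hypothetical profiled EDP of phase t at level k.
    switch_cost[j][k]: hypothetical penalty for moving from level j to level k.
    """
    num_levels = len(phase_costs[0])
    best = list(phase_costs[0])  # best[k]: cheapest cost of ending phase 0 at level k
    back = []                    # back[t - 1][k]: best predecessor of level k at phase t
    for costs in phase_costs[1:]:
        new_best, pointers = [], []
        for k in range(num_levels):
            prev = min(range(num_levels), key=lambda j: best[j] + switch_cost[j][k])
            new_best.append(best[prev] + switch_cost[prev][k] + costs[k])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace the cheapest final state back to the first phase.
    level = min(range(num_levels), key=lambda k: best[k])
    total = best[level]
    path = [level]
    for pointers in reversed(back):
        level = pointers[level]
        path.append(level)
    path.reverse()
    return path, total


# Made-up numbers: three phases, two V/F levels, unit cost to switch levels.
phase_costs = [[4.0, 6.0], [9.0, 3.0], [5.0, 5.5]]
switch_cost = [[0.0, 1.0], [1.0, 0.0]]
path, total = viterbi_vf_schedule(phase_costs, switch_cost)
print(path, total)  # → [0, 1, 1] 13.5
```

In the example the third phase stays at level 1 even though level 0 is cheaper for that phase in isolation, because the switch penalty outweighs the saving. A greedy method compares only the immediate cost at each phase, whereas the DP keeps the best path into every level and defers the choice until the end.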
Silicon Photonics for High-Performance Computing: Opportunities and Challenges!
Pub Date : 2018-10-01 DOI: 10.1109/igcc.2018.8752169
M. Nikdast
Computing systems play an important role in today’s life. They are continuously scaling, and hence becoming more complicated, to satisfy the demands of new applications, such as the higher computation and communication bandwidth required for big data and machine learning. As a result, inter- and intra-chip communication in such systems is growing rapidly due to the continuous increase in the integration density of processing cores on a single die. Silicon photonics has been introduced as a promising technology with the potential to realize high-performance interconnects in multiprocessor computing systems. This interdisciplinary talk will discuss the opportunities and challenges of employing silicon photonics in multiprocessor computing systems. In particular, it will explore the requirements, feasibility, and performance of such systems from both the physical-level and the system-level perspectives.
Citations: 0
Making Cables Disappear: Can Wireless Datacenter be a Reality?
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752167
Sayed Ashraf Mamun, A. Ganguly
A significant portion of a datacenter's power consumption is due to the power-hungry switching fabric necessary for communication. Additionally, the complex cabling in traditional datacenters poses design and maintenance challenges and increases the energy cost of the cooling infrastructure by obstructing the flow of chilled air. This work addresses these problems by designing a server-to-server wireless datacenter network (S2S-WiDCN). It is estimated that with S2S-WiDCN, power consumption is five to seventeen times lower than that of a conventional DCN fabric.
Citations: 0
IGSC 2018 Special Track on Sustainable Servers
Pub Date : 2018-10-01 DOI: 10.1109/igcc.2018.8752168
Citations: 0
How Much Cache is Enough? A Cache Behavior Analysis for Machine Learning GPU Architectures
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752137
S. López, Y. Nimkar, G. Kotas
Graphics Processing Units (GPUs) are highly parallel, power-hungry devices with large numbers of transistors devoted to the cache hierarchy. Machine learning is a target application field for these devices, which exploit their high levels of parallelism to hide long-latency memory access dependencies. Even though parallelism is the main source of performance in these devices, a large number of transistors is still devoted to the cache memory hierarchy. Through detailed analysis, we measure the real impact of the cache hierarchy on overall performance. Targeting machine learning applications, we observe that most successful cache accesses fall within a very small number of blocks. With this in mind, we propose a different cache configuration for the GPU that consumes 25% of the leakage power and 10% of the dynamic energy per access of the original cache configuration, with minimal impact on overall performance.
Citations: 0
Data-Driven User-Aware HVAC Scheduling
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752161
Daniel Petrov, Rakan Alseghayer, D. Mossé, Panos K. Chrysanthis
HVAC (Heating, Ventilation, and Air Conditioning) systems account for a significant amount of the energy spent in residential and commercial buildings. Improved wall and window insulation, energy-efficient bulbs, and building designs that make better use of thermally conditioned air are among the measures taken to address the high energy use of space conditioning. In this paper we address a main issue affecting the energy consumed in heating and cooling buildings, namely the duty cycle of the furnaces and air conditioners. We propose D-DUAL, a 3-fold scheduling mechanism that builds on a multiple-variable linear regression model. Our scheduler minimizes the duty cycle without impacting users’ comfort. Our experimental evaluation shows that the proposed approach saves up to 49% energy compared to commodity HVAC systems.
Citations: 4
Near Threshold Last Level Cache for Energy Efficient Embedded Applications
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752134
Mitali Sinha, Sidhartha Sankar Rout, G. Harsha, Sujay Deb
State-of-the-art embedded processors are used in several domains, such as vision-based and big data applications. Such applications require a huge amount of information per task and therefore need frequent main memory accesses to complete their computation. In such a scenario, a larger last-level cache (LLC) would improve system performance and throughput by greatly reducing the global miss rate and miss penalty. However, the larger cache would also increase power consumption, which matters most for battery-driven mobile devices. Near-threshold operation of memory cells is considered a notable way to save a substantial amount of energy in such applications. We propose a cache architecture that takes advantage of both near-threshold and standard LLC operation to meet the required power and performance constraints. A controller unit dynamically drives the LLC to operate in the standard or near-threshold region based on application-specific operations. The controller can also power-gate a portion of the LLC to further reduce leakage power. By simulating different MiBench benchmarks, we show that our proposed cache architecture can reduce average energy consumption by 22% with a minimal average runtime penalty of 2.5% over the baseline architecture with no cache reconfigurability.
Citations: 1
A Gate-Level Approach To Compiling For Quantum Computers
Pub Date : 2018-10-01 DOI: 10.1109/IGCC.2018.8752114
H. Dietz
Programming language constructs generally operate on data words, and so do most compiler analyses and transformations. However, individual word-level operations often harbor pointless, yet resource- and power-hungry, lower-level operations. By transforming complete programs into gate-level operations on individual bits, and optimizing operations at that level, it is possible to dramatically reduce the total amount of work needed to execute the program’s algorithm. This gate-level representation can be expressed in terms of any complete set of logic gate types. Earlier work targeted conventional multiplexor gates, but the work reported here centers on targeting CSWAP (Fredkin) gates without fanout, a form that can be implemented on a quantum computer. This paper gives an overview of the approach, describes the current state of the prototype compiler, and suggests some ways in which compiler automatic parallelization technology might be extended to allow ordinary programs to take advantage of the unique properties of quantum computers.
Citations: 1
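The abstract's claim that CSWAP (Fredkin) gates form a complete gate set can be illustrated with a short classical simulation: fixing some inputs to constants yields AND and NOT, and the gate is its own inverse. This is a sketch on ordinary bits, not the paper's compiler and not a quantum simulation; the function names are chosen here for illustration.

```python
def cswap(control, a, b):
    """Fredkin gate: swap a and b when control is 1; control passes through."""
    if control:
        return control, b, a
    return control, a, b


def and_via_cswap(x, y):
    """x AND y appears on the third output when the third input is a constant 0."""
    _, _, result = cswap(x, y, 0)
    return result


def not_via_cswap(x):
    """NOT x appears on the third output when the data inputs are constants 0 and 1."""
    _, _, result = cswap(x, 0, 1)
    return result


# Print the truth table of the derived AND gate.
for x in (0, 1):
    for y in (0, 1):
        print(x, y, and_via_cswap(x, y))
```

In `not_via_cswap`, the second output of the same gate carries a copy of `x`. Copies have to be produced this way, using constant inputs, when gate outputs cannot simply be fanned out to several consumers.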