Mitigating the Energy Impacts of VBTI Aging in Photonic Networks-on-Chip Architectures with Multilevel Signaling
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752130
Ishan G. Thakkar, S. Pasricha
Photonic networks-on-chip (PNoCs) can enable higher-bandwidth, lower-latency data transfers at the speed of light. Such PNoCs consist of photonic waveguides carrying dense-wavelength-division-multiplexed (DWDM) signals and microring resonators (MRs) for signal modulation and reception. For MRs to modulate and receive DWDM photonic signals, the free-carrier concentration in, or the operating temperature of, the MRs must be altered through voltage biasing. However, long-term operation of MRs under constant or time-varying temperature and voltage biasing causes aging. Such voltage-bias and temperature-induced (VBTI) aging in MRs leads to resonance wavelength drift and Q-factor degradation at the device level, which in turn exacerbates three key spectral effects at the photonic link level: intermodulation crosstalk, heterodyne crosstalk, and signal sidelobe truncation. These adverse spectral effects ultimately increase signal power attenuation and energy-per-bit in PNoCs. Our frequency-domain analysis of photonic links shows that using four-level pulse amplitude modulation (4-PAM) signaling instead of traditional on-off keying (OOK) signaling can proactively reduce the signal attenuation caused by these VBTI-aging-induced spectral effects. Our system-level evaluation results indicate that, compared to OOK-based PNoCs with no aging, 4-PAM-based PNoCs can achieve 5.5% better energy efficiency even after undergoing VBTI aging for three years.
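
As a rough intuition for why multilevel signaling helps (a minimal illustrative sketch, not the authors' link model): one plausible reading of the abstract's frequency-domain argument is that 4-PAM carries two bits per symbol, so the same bit rate needs only half the symbol rate of OOK, which narrows each channel's spectrum and leaves it less exposed to sidelobe truncation and crosstalk. The Gray-coded mapping below is generic; the amplitude levels are placeholders.

    # Generic Gray-coded 4-PAM encoder/decoder (illustrative only; not from the paper).
    # Two bits map to one of four amplitude levels, so 4-PAM needs half as many
    # symbols as OOK for the same payload.
    GRAY_MAP = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
    INV_MAP = {level: bits for bits, level in GRAY_MAP.items()}

    def pam4_encode(bits):
        """Pack an even-length bit sequence into 4-PAM symbols (two bits per symbol)."""
        assert len(bits) % 2 == 0
        return [GRAY_MAP[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

    def pam4_decode(symbols):
        """Recover the original bit sequence from 4-PAM symbols."""
        out = []
        for s in symbols:
            out.extend(INV_MAP[s])
        return out

    bits = [1, 0, 0, 1, 1, 1, 0, 0]
    symbols = pam4_encode(bits)              # 4 symbols carry 8 bits
    assert pam4_decode(symbols) == bits
    print(len(bits), "bits ->", len(symbols), "4-PAM symbols (OOK would need 8)")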
{"title":"Mitigating the Energy Impacts of VBTI Aging in Photonic Networks-on-Chip Architectures with Multilevel Signaling","authors":"Ishan G. Thakkar, S. Pasricha","doi":"10.1109/IGCC.2018.8752130","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752130","url":null,"abstract":"Photonic networks-on-chip (PNoCs) can enable higher bandwidth and lower latency data transfers at the speed of light. Such PNoCs consist of photonic waveguides with dense-wavelength-division-multiplexing (DWDM) for signal traversal and microring resonators (MRs) for signal modulation and reception. To enable MRs to modulate and receive DWDM photonic signals, change in the free-carrier concentration in or operating temperature of MRs through their voltage biasing is essential. But long-term operation of MRs with constant or time-varying temperature and voltage biasing causes aging. Such voltage bias and temperature induced (VBTI) aging in MRs leads to resonance wavelength drifts and Q-factor degradation at the device-level, which in turn exacerbates three key spectral effects at the photonic link level, namely the intermodulation crosstalk, heterodyne crosstalk, and signal sidelobes truncation. These adverse spectral effects ultimately increase signal power attenuation and energy-per-bit in PNoCs. Our frequency-domain analysis of photonic links shows that the use of the four pulse amplitude modulation (4-PAM) signaling instead of the traditional on-off keying (OOK) signaling can proactively reduce signal attenuation caused by the VBTI aging induced spectral effects. Our system-level evaluation results indicate that, compared to OOK based PNoCs with no aging, 4-PAM based PNoCs can achieve 5.5% better energy-efficiency even after undergoing VBTI aging for 3 Years.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122154045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Evaluating Radial Basis Function Kernel on OpenCL FPGA Platform
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752172
Zheming Jin, H. Finkel
Field-programmable gate arrays (FPGAs) are becoming a promising heterogeneous computing component for scientific computing now that floating-point-optimized architectures are being added to current FPGAs. Emerging high-level synthesis (HLS) tools provide a streamlined design flow that makes FPGAs accessible to researchers with little FPGA development experience. In this paper, we choose the Radial Basis Function (RBF) kernel of a support vector machine as a case study to evaluate the potential of implementing machine learning kernels on FPGAs, and the ability of an HLS tool to convert a kernel written in a high-level language into an FPGA implementation. We explain the HLS flow and the RBF kernel, evaluate the kernel in an OpenCL-to-FPGA HLS flow, and describe the kernel optimizations. Our optimizations using kernel vectorization and loop unrolling improve kernel performance by a factor of 15.8 over a baseline kernel on the Nallatech 385A FPGA card, which features an Intel Arria 10 GX 1150 FPGA. In terms of energy efficiency, the performance per watt on the FPGA platform is 2.8X higher than on an Intel Xeon 16-core CPU and 1.7X higher than on an Nvidia Tesla K80 GPU. On the other hand, the performance per watt on an Intel Xeon Phi Knights Landing CPU and an Nvidia Tesla P100 GPU is 5.3X and 1.7X higher, respectively, than on the FPGA.
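
For reference, the RBF kernel computes K(x, z) = exp(-gamma * ||x - z||^2) between a query vector and each support vector. The sketch below is a plain scalar formulation (not the authors' OpenCL code): the inner loop over features is the natural target for the kernel vectorization described above, and the outer loop over support vectors for the loop unrolling.

    import math

    def rbf_kernel(x, z, gamma):
        """K(x, z) = exp(-gamma * ||x - z||^2); the feature loop is what vectorization targets."""
        acc = 0.0
        for xi, zi in zip(x, z):
            d = xi - zi
            acc += d * d
        return math.exp(-gamma * acc)

    def svm_decision(query, support_vectors, alphas, bias, gamma):
        """sum_i alpha_i * K(sv_i, query) + bias; the support-vector loop is what unrolling targets."""
        return sum(a * rbf_kernel(sv, query, gamma)
                   for a, sv in zip(alphas, support_vectors)) + bias

    # Toy usage with made-up values.
    print(svm_decision(query=[1.0, 2.0],
                       support_vectors=[[0.0, 0.0], [1.0, 1.0]],
                       alphas=[0.5, -0.25], bias=0.1, gamma=0.5))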
{"title":"Evaluating Radial Basis Function Kernel on OpenCL FPGA Platform","authors":"Zheming Jin, H. Finkel","doi":"10.1109/IGCC.2018.8752172","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752172","url":null,"abstract":"Field-programmable gate arrays (FPGAs) are becoming a promising heterogeneous computing component for scientific computing when floating-point optimized architectures are added to the current FPGAs. The emerging high-level synthesis (HLS) tools provide a streamlined design flow to facilitate the use of FPGAs for researchers who have little FPGA development experience. In this paper, we choose the kernel, Radial Basis Function, in a support vector machine as a case study to evaluate the potential of implementing machine learning kernels on FPGAs, and the capabilities of an HLS tool to convert a kernel written in high-level language to an FPGA implementation. We explain the HLS flow and the RBF kernel. We evaluate the kernel in an OpenCL-to-FPGA HLS flow, and describe the optimizations of the kernel. Our optimizations using kernel vectorization and loop unrolling improve the kernel performance by a factor of 15.8 compared to a baseline kernel on the Nallatech 385A FPGA card that features an Intel Arria 10 GX 1150 FPGA. In terms of energy efficiency, the performance per watt on the FPGA platform is 2.8X higher than that on an Intel Xeon 16-core CPU, and 1.7X higher than that on an Nvidia Tesla K80 GPU. On the other hand, the performance per watt on an Intel Xeon Phi Knights Landing CPU and an Nvidia Tesla P100 GPU are 5.3X and 1.7X higher than that on the FPGA, respectively.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124086763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A Dynamic Programming Technique for Energy-Efficient Multicore Systems
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752159
Shervin Hajiamini, B. Shirazi, Aaron S. Crandall, Hassan Ghasemzadeh
With a focus on static (compile-time) methods for V/F-level assignment, we propose an efficient dynamic programming (DP) technique based on the Viterbi algorithm, which uses the energy-delay product (EDP) as the objective function to predict the best V/F levels. Using profiled application information, this technique minimizes energy consumption and execution time. We evaluate and compare the performance of the proposed algorithm against three heuristic methods: a greedy version of our algorithm, a feedback controller method, and a simple heuristic that uses historical performance to make predictions for adjusting the V/F levels. Experimental results show that our algorithm outperforms the heuristics under study by an average of 12 to 24% on the EDP criterion.
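
A minimal sketch of the idea (reconstructed from the abstract, not the paper's exact formulation): treat each profiled execution interval as a Viterbi stage and each V/F level as a state, keep the lowest accumulated cost per state, and add a penalty when the level switches between stages. The per-interval energy and delay values would come from profiling; the cost below is a simple per-interval energy*delay sum used only for illustration.

    def viterbi_vf_schedule(energy, delay, switch_penalty=0.0):
        """
        energy[t][l], delay[t][l]: profiled energy and delay of interval t at V/F level l.
        Returns (minimum accumulated cost, chosen V/F level per interval).
        Cost: per-interval energy*delay plus a penalty per level switch
        (an illustrative EDP-style objective; the paper's formulation may differ).
        """
        n = len(energy[0])                                   # number of V/F levels
        cost = [energy[0][l] * delay[0][l] for l in range(n)]
        back = []                                            # back[t][l] = best predecessor level
        for t in range(1, len(energy)):
            new_cost, choice = [], []
            for l in range(n):
                prev = min(range(n),
                           key=lambda p: cost[p] + (switch_penalty if p != l else 0.0))
                choice.append(prev)
                new_cost.append(cost[prev] + (switch_penalty if prev != l else 0.0)
                                + energy[t][l] * delay[t][l])
            back.append(choice)
            cost = new_cost
        level = min(range(n), key=lambda l: cost[l])
        path = [level]
        for choice in reversed(back):                        # backtrack the best path
            level = choice[level]
            path.append(level)
        return min(cost), path[::-1]

    # Toy usage: 3 intervals, 2 V/F levels (numbers are illustrative).
    E = [[4.0, 2.5], [3.0, 2.0], [5.0, 3.5]]
    D = [[1.0, 1.6], [1.0, 1.5], [1.0, 1.4]]
    print(viterbi_vf_schedule(E, D, switch_penalty=0.2))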
{"title":"A Dynamic Programming Technique for Energy-Efficient Multicore Systems","authors":"Shervin Hajiamini, B. Shirazi, Aaron S. Crandall, Hassan Ghasemzadeh","doi":"10.1109/IGCC.2018.8752159","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752159","url":null,"abstract":"With a focus on static (compile-time) methods for V/F level assignments, we propose an efficient Dynamic programming (DP) technique using the Viterbi algorithm, which uses the Energy-Delay Product (EDP) as objective function to predict the best V/F levels. By using the profiled information of applications, this technique minimizes energy consumption and execution time. We evaluate and compare the performance of the proposed algorithm against three heuristic methods—a greedy version of our algorithm, a feedback controller method, and a simple heuristic that uses historical performance to make predictions for adjusting the V/F levels. Experimental results show that our algorithm outperforms the heuristics under the study by an average of 12 to 24% using the EDP performance criteria.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114228100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Silicon Photonics for High-Performance Computing: Opportunities and Challenges!
Pub Date: 2018-10-01 | DOI: 10.1109/igcc.2018.8752169
M. Nikdast
Computing systems play an important role in today's life. They are continuously scaling, and hence becoming more complicated, to satisfy new application demands such as the higher computation and communication bandwidth required by big data and machine learning applications. As a result, inter- and intra-chip communication in such systems is growing rapidly, driven by the continuous increase in the integration density of processing cores on a single die. Silicon photonics has been introduced as a promising technology with the potential to realize high-performance interconnects in multiprocessor computing systems. This interdisciplinary talk will discuss opportunities as well as challenges in employing silicon photonics in multiprocessor computing systems. In particular, it will explore the requirements, feasibility, and performance of such systems from both the physical-level and the system-level perspectives.
{"title":"Silicon Photonics for High-Performance Computing: Opportunities and Challenges!","authors":"M. Nikdast","doi":"10.1109/igcc.2018.8752169","DOIUrl":"https://doi.org/10.1109/igcc.2018.8752169","url":null,"abstract":"Computing systems play an important role in today’s life. They are continuously scaling, and hence becoming more complicated, to satisfy new applications demands, such as higher computation and communication bandwidth required for big data and machine learning applications. As a result, the inter- and intra-chip communication in such systems is growing rapidly due to the continuous increase in the integration density of processing cores on a single die. Silicon photonics is introduced as a promising technology with potentials in realizing high-performance interconnect in multiprocessor computing systems. This interdisciplinary talk will discuss different opportunities as well as challenges related to employing silicon photonics in multiprocessor computing systems. Particularly, it will explore the requirements, feasibility, and performance of such systems while considering both the physical-level and the system-level perspectives.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123816954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Making Cables Disappear: Can Wireless Datacenter be a Reality?
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752167
Sayed Ashraf Mamun, A. Ganguly
A significant portion of datacenter power consumption is due to the power-hungry switching fabric required for communication. Additionally, the complex cabling in traditional datacenters poses design and maintenance challenges and increases the energy cost of the cooling infrastructure by obstructing the flow of chilled air. In this work, these problems of traditional datacenters are addressed by designing a server-to-server wireless datacenter network (S2S-WiDCN). We estimate that S2S-WiDCN lowers power consumption by a factor of five to seventeen compared to a conventional DCN fabric.
{"title":"Making Cables Disappear: Can Wireless Datacenter be a Reality?","authors":"Sayed Ashraf Mamun, A. Ganguly","doi":"10.1109/IGCC.2018.8752167","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752167","url":null,"abstract":"Significant portion of the power consumption of datacenter is due to the power-hungry switching fabric necessary for communication. Additionally, the complex cabling in traditional datacenters pose design and maintenance challenges and increase the energy cost of the cooling infrastructure by obstructing the flow of chilled air. In this work, these problems of traditional datacenters are addressed by designing a server-to-server wireless datacenter network (S2S-WiDCN). It is estimated that by implementing S2S-WiDCN, power consumption is lower by five to seventeen times compared to a conventional DCN fabric.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121870808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

IGSC 2018 Special Track on Sustainable Servers
Pub Date: 2018-10-01 | DOI: 10.1109/igcc.2018.8752168
{"title":"IGSC 2018 Special Track on Sustainable Servers","authors":"","doi":"10.1109/igcc.2018.8752168","DOIUrl":"https://doi.org/10.1109/igcc.2018.8752168","url":null,"abstract":"","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129606350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

How Much Cache is Enough? A Cache Behavior Analysis for Machine Learning GPU Architectures
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752137
S. López, Y. Nimkar, G. Kotas
Graphics Processing Units (GPUs) are highly parallel, power-hungry devices with a large number of transistors devoted to the cache hierarchy. Machine learning is a target application field for these devices, which exploit high levels of parallelism to hide long-latency memory accesses. Even though parallelism is the main source of performance in these devices, a large number of transistors is still devoted to the cache memory hierarchy. Through detailed analysis, we measure the real impact of the cache hierarchy on overall performance. Targeting machine learning applications, we observe that most successful cache accesses happen in a very small number of blocks. With this in mind, we propose a different cache configuration for the GPU that requires only 25% of the leakage power and 10% of the dynamic energy per access of the original cache configuration, with minimal impact on overall performance.
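
To put those two ratios together (a purely hypothetical calculation; only the 25% and 10% factors come from the abstract, all other numbers are invented for illustration), total cache energy can be modeled as leakage power times runtime plus dynamic energy per access times the access count:

    # Hypothetical baseline figures; only the 0.25 and 0.10 scaling factors are from the abstract.
    leak_power_w = 0.8      # baseline cache leakage power (W)
    dyn_energy_j = 0.4e-9   # baseline dynamic energy per cache access (J)
    runtime_s    = 2.0      # application runtime (s)
    accesses     = 5e9      # number of cache accesses

    baseline = leak_power_w * runtime_s + dyn_energy_j * accesses
    proposed = 0.25 * leak_power_w * runtime_s + 0.10 * dyn_energy_j * accesses

    print(f"baseline cache energy: {baseline:.2f} J")
    print(f"proposed cache energy: {proposed:.2f} J ({proposed / baseline:.0%} of baseline)")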
{"title":"How Much Cache is Enough? A Cache Behavior Analysis for Machine Learning GPU Architectures","authors":"S. López, Y. Nimkar, G. Kotas","doi":"10.1109/IGCC.2018.8752137","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752137","url":null,"abstract":"Graphic Processing Units (GPUs) are highly parallel, power hungry devices with large numbers of transistors devoted to the cache hierarchy. Machine learning is a target application field of these devices, which take advantage of their high levels of parallelism to hide long latency memory access dependencies. Even though parallelism is the main source of performance in these devices, a large number of transistors is still devoted to the cache memory hierarchy. Upon detailed analysis, we measure the real impact of the cache hierarchy on the overall performance. Targeting Machine Learning applications, we observed that most of the successful cache accesses happen in a very reduced number of blocks.With this in mind, we propose a different cache configuration for the GPU, resulting in 25% of the leakage power consumption and 10% of the dynamic energy per access of the original cache configuration, with minimal impact on the overall performance.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124681398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Data-Driven User-Aware HVAC Scheduling
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752161
Daniel Petrov, Rakan Alseghayer, D. Mossé, Panos K. Chrysanthis
HVAC (heating, ventilation, and air conditioning) systems account for a significant amount of the energy spent in residential and commercial buildings. Improved wall and window insulation, energy-efficient bulbs, and building designs that make better use of thermally conditioned air are among the measures taken to address the high energy usage of space conditioning. In this paper we address a main factor in the energy consumption for heating and cooling of buildings, namely the duty cycle of the furnaces/air conditioners. We propose D-DUAL, a 3-fold scheduling mechanism built on a multivariable linear regression model. Our scheduler minimizes the duty cycle without impacting users' comfort. Our experimental evaluation shows that the proposed approach saves up to 49% energy compared to commodity HVAC systems.
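
A minimal sketch of the regression component (an illustration with hypothetical features; the abstract does not specify D-DUAL's inputs): fit a multivariable linear model that predicts indoor temperature drift from outdoor temperature, current indoor temperature, and occupancy, then keep the furnace/air conditioner off whenever the predicted temperature stays inside the user's comfort band.

    import numpy as np

    # Hypothetical training data: columns = [outdoor_temp, indoor_temp, occupancy],
    # target = indoor temperature drift over the next 15 minutes (degrees C).
    X = np.array([[ 2.0, 21.0, 1.0],
                  [ 5.0, 22.0, 0.0],
                  [-1.0, 20.5, 1.0],
                  [10.0, 23.0, 0.0]])
    y = np.array([-0.6, -0.3, -0.9, 0.1])

    # Least-squares fit of drift ~ w0 + w1*outdoor + w2*indoor + w3*occupancy.
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predicted_indoor(outdoor, indoor, occupancy):
        """Current indoor temperature plus the predicted drift for the next interval."""
        return indoor + float(np.array([1.0, outdoor, indoor, occupancy]) @ w)

    comfort_low, comfort_high = 20.0, 23.0
    nxt = predicted_indoor(outdoor=3.0, indoor=21.5, occupancy=1.0)
    print("predicted indoor temp:", round(nxt, 2),
          "-> HVAC", "off" if comfort_low <= nxt <= comfort_high else "on")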
{"title":"Data-Driven User-Aware HVAC Scheduling","authors":"Daniel Petrov, Rakan Alseghayer, D. Mossé, Panos K. Chrysanthis","doi":"10.1109/IGCC.2018.8752161","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752161","url":null,"abstract":"HVAC (Heat, Ventilation, Air Conditioning) systems account for significant amount of energy spent in residential and commercial buildings. Improved wall and window insulation, energy efficient bulbs as well as building design that facilitates a more optimal usage of the thermally conditioned air within a building, are amongst some of the measures taken to address the high usage of energy for space conditioning. In this paper we address a main issue that affects the energy consumption for heating and cooling of buildings, namely the duty cycle of the furnaces/air-conditioners. We propose D-DUAL, a 3-fold scheduling mechanism that builds on multiple variable linear regression model. Our scheduler minimizes the duty cycle and does not impact users’ comfort. Our experimental evaluation shows that our proposed approach saves up to 49% energy, compared to commodity HVAC systems.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121172911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Near Threshold Last Level Cache for Energy Efficient Embedded Applications
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752134
Mitali Sinha, Sidhartha Sankar Rout, G. Harsha, Sujay Deb
State-of-the-art embedded processors are used in several domains, such as vision-based and big data applications. Such applications require a huge amount of information per task and thereby need frequent main memory accesses to perform the entire computation. In this scenario, a larger last-level cache (LLC) would improve system performance and throughput by greatly reducing the global miss rate and miss penalty. But this would increase power consumption due to the extended cache memory, which is especially significant for battery-driven mobile devices. Near-threshold operation of memory cells is considered a notable solution for saving a substantial amount of energy in such applications. We propose a cache architecture that takes advantage of both near-threshold and standard LLC operation to meet the required power and performance constraints. A controller unit is implemented to dynamically drive the LLC to operate in the standard or near-threshold operating region based on application-specific operations. The controller can also power-gate a portion of the LLC to further reduce leakage power. By simulating different MiBench benchmarks, we show that our proposed cache architecture can reduce average energy consumption by 22% with a minimal average runtime penalty of 2.5% over a baseline architecture with no cache reconfigurability.
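
A minimal sketch of one way such a controller could decide (an illustration; the abstract does not give the actual policy, metrics, or thresholds): sample LLC activity each epoch, drop to the near-threshold region when the cache is lightly used, return to standard voltage in performance-critical phases, and power-gate part of the LLC when it is nearly idle.

    def llc_mode(llc_accesses_per_kilo_instr, llc_miss_rate,
                 gate_threshold=1.0, low_activity=5.0, high_activity=20.0):
        """Pick the LLC operating mode for the next epoch. All thresholds are hypothetical placeholders."""
        if llc_accesses_per_kilo_instr < gate_threshold:
            return "power-gate part of the LLC"          # cache nearly idle
        if llc_accesses_per_kilo_instr < low_activity and llc_miss_rate < 0.2:
            return "near-threshold voltage"              # latency slack: trade speed for energy
        if llc_accesses_per_kilo_instr > high_activity:
            return "standard voltage"                    # performance-critical phase
        return "keep current mode"

    print(llc_mode(0.5, 0.05))    # -> power-gate part of the LLC
    print(llc_mode(3.0, 0.10))    # -> near-threshold voltage
    print(llc_mode(35.0, 0.40))   # -> standard voltage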
{"title":"Near Threshold Last Level Cache for Energy Efficient Embedded Applications","authors":"Mitali Sinha, Sidhartha Sankar Rout, G. Harsha, Sujay Deb","doi":"10.1109/IGCC.2018.8752134","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752134","url":null,"abstract":"State-of-the-art embedded processors find their use in several domains like vision-based and big data applications. Such applications require a huge amount of information per task, and thereby need frequent main memory accesses to perform the entire computation. In such a scenario, a bigger size last level cache (LLC) would improve the performance and throughput of the system by reducing the global miss rate and miss penalty to a large extent. But this would lead to increased power consumption due to the extended cache memory, which becomes more significant for battery-driven mobile devices. Near threshold operation of memory cells is considered as a notable solution in saving a substantial amount of energy for such applications. We propose a cache architecture that takes advantage of both near threshold and standard LLC operation to meet the required power and performance constraints. A controller unit is implemented to dynamically drive the LLC to operate at standard or near threshold operating region based on application specific operations. The controller can also power gate a portion of LLC to further reduce the leakage power. By simulating different MiBench benchmarks, we show that our proposed cache architecture can reduce average energy consumption by 22% with a minimal average runtime penalty of 2.5% over the baseline architecture with no cache reconfigurability.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"44 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126071991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A Gate-Level Approach To Compiling For Quantum Computers
Pub Date: 2018-10-01 | DOI: 10.1109/IGCC.2018.8752114
H. Dietz
Programming language constructs generally operate on data words, and so does most compiler analysis and transformation. However, individual word-level operations often harbor pointless, yet resource- and power-hungry, lower-level operations. By transforming complete programs into gate-level operations on individual bits, and optimizing operations at that level, it is possible to dramatically reduce the total amount of work needed to execute the program's algorithm. This gate-level representation can be expressed in terms of any complete set of logic gate types; earlier work targeted conventional multiplexor gates, but the work reported here centers on targeting CSWAP (Fredkin) gates without fanout, a form that can be implemented on a quantum computer. This paper overviews the approach, describes the current state of the prototype compiler, and suggests some ways in which compiler automatic-parallelization technology might be extended to allow ordinary programs to take advantage of the unique properties of quantum computers.
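
For reference, the CSWAP (Fredkin) gate takes a control bit and two data bits and swaps the data bits exactly when the control is 1; it is reversible, and with constant ancilla inputs it is universal for classical logic, which is what makes fanout-free Fredkin networks a plausible target for quantum hardware. The sketch below shows the classical semantics and the standard AND construction (an illustration, not the paper's compiler output).

    def fredkin(c, a, b):
        """CSWAP: swap a and b iff control c is 1. Applying the gate twice restores the inputs."""
        return (c, b, a) if c else (c, a, b)

    def and_gate(x, y):
        """AND from one Fredkin gate: with inputs (x, y, 0), the third output equals x AND y."""
        _, _, out = fredkin(x, y, 0)
        return out

    # Exhaustive checks of reversibility and the AND construction.
    for c in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                assert fredkin(*fredkin(c, a, b)) == (c, a, b)
    for x in (0, 1):
        for y in (0, 1):
            assert and_gate(x, y) == (x & y)
    print("Fredkin gate checks passed")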
{"title":"A Gate-Level Approach To Compiling For Quantum Computers","authors":"H. Dietz","doi":"10.1109/IGCC.2018.8752114","DOIUrl":"https://doi.org/10.1109/IGCC.2018.8752114","url":null,"abstract":"Programming language constructs generally operate on data words, and so does most compiler analysis and transformation. However, individual word-level operations often harbor pointless, yet resource and power hungry, lower-level operations. By transforming complete programs into gate-level operations on individual bits, and optimizing operations at that level, it is possible to dramatically reduce the total amount of work needed to execute the program’s algorithm. This gate-level representation can be in terms of any complete set of logic gate types; earlier work targeted conventional multiplexor gates, but the work reported here centers on targeting CSWAP (FredKin) gates without fanout – a form that can be implemented on a quantum computer. This paper will overview the approach, describe the current state of the prototype compiler, and suggest some ways in which compiler automatic parallelization technology might be extended to allow ordinary programs to take advantage of the unique properties of quantum computers.","PeriodicalId":388554,"journal":{"name":"2018 Ninth International Green and Sustainable Computing Conference (IGSC)","volume":"348 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128026327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}