2017 IEEE International Conference on Rebooting Computing (ICRC): Latest Papers

Generalize or Die: Operating Systems Support for Memristor-Based Accelerators
Pub Date : 2017-11-28 DOI: 10.1109/ICRC.2017.8123649
P. Bruel, S. R. Chalamalasetti, Chris I. Dalton, I. E. Hajj, A. Goldman, Catherine E. Graves, Wen-mei W. Hwu, Phil Laplante, D. Milojicic, Geoffrey Ndu, J. Strachan
The deceleration of transistor feature size scaling has motivated growing adoption of specialized accelerators implemented as GPUs, FPGAs, ASICs, and more recently new types of computing such as neuromorphic, bio-inspired, ultra-low-energy, reversible, stochastic, optical, quantum, combinations, and others unforeseen. There is a tension between specialization and generalization, with the current state trending to master-slave models where accelerators (slaves) are instructed by a general-purpose system (master) running an Operating System (OS). Traditionally, an OS is a layer between hardware and applications, and its primary function is to manage hardware resources and provide a common abstraction to applications. Does this function, however, apply to new types of computing paradigms? This paper revisits OS functionality for memristor-based accelerators. We explore one accelerator implementation, the Dot Product Engine (DPE), for a select pattern of applications in machine learning, imaging, and scientific computing and a small set of use cases. We explore typical OS functionality, such as reconfiguration, partitioning, security, virtualization, and programming. We also explore new types of functionality, such as precision and trustworthiness of reconfiguration. We claim that making an accelerator, such as the DPE, more general will result in broader adoption and better utilization.
Citations: 11
Reducing Binary Quadratic Forms for More Scalable Quantum Annealing
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123654
Georg Hahn, H. Djidjev
Recent advances in the development of commercial quantum annealers such as the D-Wave 2X allow solving NP-hard optimization problems that can be expressed as quadratic unconstrained binary programs. However, the relatively small number of available qubits (around 1000 for the D-Wave 2X quantum annealer) poses a severe limitation to the range of problems that can be solved. This paper explores the suitability of preprocessing methods for reducing the sizes of the input programs and thereby the number of qubits required for their solution on quantum computers. Such methods allow us to determine the value of certain variables that hold in either any optimal solution (called strong persistencies) or in at least one optimal solution (weak persistencies). We investigate preprocessing methods for two important NP-hard graph problems, the computation of a maximum clique and a maximum cut in a graph. We show that the identification of strong and weak persistencies for those two optimization problems is very instance-specific, but can lead to substantial reductions in the number of variables.
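To make the phrase "quadratic unconstrained binary program" concrete, the following minimal sketch writes maximum cut as such a program and solves a tiny instance by brute force. It is an illustration only, not the paper's preprocessing method: the function names (`maxcut_qubo`, `qubo_energy`, `brute_force_min`) and the five-edge example graph are assumptions introduced here, and persistency detection is not implemented.

```python
from itertools import product

def maxcut_qubo(edges):
    """Build QUBO coefficients Q so that minimizing sum(Q[i, j] * x_i * x_j)
    over binary x maximizes the number of cut edges."""
    Q = {}
    for i, j in edges:
        # An edge is cut when x_i != x_j, i.e. x_i + x_j - 2*x_i*x_j == 1.
        # Negating turns the maximization into a minimization.
        Q[(i, i)] = Q.get((i, i), 0) - 1
        Q[(j, j)] = Q.get((j, j), 0) - 1
        key = (min(i, j), max(i, j))
        Q[key] = Q.get(key, 0) + 2
    return Q

def qubo_energy(Q, x):
    # For binary variables x_i * x_i == x_i, so diagonal entries act as linear terms.
    return sum(c * x[i] * x[j] for (i, j), c in Q.items())

def brute_force_min(Q, n):
    """Exhaustive search; feasible only for very small n (hypothetical stand-in for an annealer)."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)

# Example graph (an assumption for illustration): a 4-cycle with one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
Q = maxcut_qubo(edges)
assignment, energy = brute_force_min(Q, 4)
max_cut = -energy  # the cut size is the negated minimum energy
```

Each binary variable needs at least one qubit, so any variable whose value can be fixed in advance reduces the qubit requirement before the annealer is called.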
Citations: 11
Reconfigurable and Programmable Ion Trap Quantum Computer
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123665
Stewart Allen, Jungsang Kim, D. Moehring, C. Monroe
We present progress on the construction and operation of a room-temperature quantum computer built with trapped atomic ion qubits. Based on the technological underpinnings of atomic clocks that define time, atomic qubits are standards of quantum information because they are all identical. They present a fundamentally scalable approach to quantum computation where interactions can be faithfully replicated and measured with near-perfect efficiency. Moreover, the connections among atomic ion qubits are forged from external laser beams and mediated by the Coulomb repulsion between them, and hence behave as a fully reconfigurable quantum circuit, much like an FPGA in classical computation. We further discuss paths to scaling using demonstrated technologies that are unique to this class of quantum computation devices. This flexibility will likely allow ion trap quantum computers to express the superset of all known quantum computation operations, and thus efficiently target any type of application that arises.
Citations: 6
Auditory Neural Pathway Simulation
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123633
R. Meeson
We describe an effort to simulate the neural pathway from the inner ear (cochlea) to the primary auditory cortex in the brain. The human cochlea contains sensory cells (inner hair cells), which respond to the mechanical motion of traveling waves that sweep along the basilar membrane. Neurons triggered by the sensory cells carry sound signals from the cochlea to the brain through a series of a half-dozen transfer sites. At each junction, firing neurons stimulate some neighboring neurons and inhibit others. The signal processing effects of these interactions are not fully understood. The net behavior is difficult to observe in vivo because the neurons are not easily accessible and only relatively few can be measured at one time. As a result, the "neural code" that represents sound signals is not understood. We do know, however, that our perception of sound is much more refined than the signal observable at the cochlea. Frequencies are only broadly separated within the cochlea, for example, yet we are able to perceive very narrow differences in pitch. The simulation model we are constructing provides a means to fully instrument all of the neurons and their interactions. The model allows for a wide range of signal analysis experimentation, which we hope will help untangle how this neural processing works.
Citations: 1
Socrates-D: Multicore Architecture for On-Line Learning
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123668
Yangjie Qi, Raqibul Hasan, Rasitha Fernando, T. Taha
Compact online learning architectures could be used to enhance Internet of Things devices, allowing them to learn directly from the data they receive instead of having to ship data to a remote server for learning. This saves communications energy and enhances privacy and security, as the data is not shared. The learning architectures can also be used in high-performance computing and in traditional computing architectures to learn approximations of the functions being performed based on runtime activities. This paper presents Socrates-D, a digital multicore on-chip learning architecture for deep neural networks. It has memories internal to each neural core to store synaptic weights. A variety of deep learning applications can be processed in this architecture. The system-level area and power benefits of the specialized architecture are compared with an NVIDIA GEFORCE GTX 980Ti GPGPU. Our experimental evaluations show that the proposed architecture can provide significant area and energy efficiencies over GPGPUs for both training and inference.
Citations: 3
The Superstrider Architecture: Integrating Logic and Memory Towards Non-Von Neumann Computing
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123669
S. Srikanth, T. Conte, E. Debenedictis, Jeanine E. Cook
We present a new non-von Neumann architecture, termed "Superstrider," predicated on no more than current projected improvements in semiconductor components and 3D manufacturing technologies, which should offer orders of magnitude advances in both energy efficiency and performance for many high-utility problem classes. The architecture is based on computing on row-wide memory words to accelerate sparse matrix algebraic operations that are normally implemented as scalar operations. A cycle-accurate simulation demonstrates potential performance improvements on the order of 50× on existing High Bandwidth Memory (HBM), increasing to 1000× or more when implemented using a fully integrated 3D technology and compared to a simple baseline. Further refinement may change these numbers, but the magnitude of the opportunity suggests further work.
Citations: 14
Generating Sparse Representations Using Quantum Annealing: Comparison to Classical Algorithms
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123653
N. T. Nguyen, Amy E. Larson, Garrett T. Kenyon
We use a quantum annealing D-Wave 2X (1,152-qubit) computer to generate sparse representations of Canny-filtered, center-cropped 30x30 CIFAR-10 images. Each binary neuron (qubit) represents a feature kernel obtained initially by imprinting on a randomly chosen 5x5 image patch and then adapted via an off-line Hebbian learning protocol using the sparse solutions generated by the D-Wave. When using binary neurons, the energy function is non-convex (multiple local minima) and finding a global minimum is NP-hard. Quantum annealing provides a strategy for finding sparse representations that correspond to good local minima of a non-convex cost function. To overcome the severe coupling restrictions between physical qubits on the D-Wave Chimera graph, we use embedding tools to achieve approximately all-to-all connectivity across a reduced number of logical qubits. We assess the sparse representations generated by the D-Wave using both the total energy and the classification accuracy on a subset of the CIFAR-10 database. The D-Wave 2X outperforms two classical state-of-the-art binary solvers, GUROBI and the Chimera-inspired algorithm Hamze-Freitas-Selby (HFS). Specifically, the D-Wave 2X yields lower-energy sparse solutions within seconds, while the largest problems take over 10 hours for both GUROBI and HFS. We obtained a cross-validation classification accuracy of 31.02% for the first 4K images using 47 features on the quantum D-Wave 2X.
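The abstract does not spell out the energy function, so the sketch below assumes the standard sparse-coding objective E(a) = 1/2 ||x - sum_i a_i d_i||^2 + lambda * sum_i a_i with binary activations a_i, and shows how it expands into QUBO coefficients of the kind an annealer accepts. The objective, the function names, and the toy four-pixel "image" with four feature vectors are all assumptions made for illustration; embedding onto the Chimera graph is not shown.

```python
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sparse_coding_qubo(x, features, lam):
    """Expand 0.5*||x - sum_i a_i d_i||^2 + lam*sum_i a_i (binary a) into QUBO form.
    Returns (Q, const) with energy(a) = sum(Q[i, j] * a_i * a_j) + const."""
    Q = {}
    for i, d_i in enumerate(features):
        # Linear terms sit on the diagonal because a_i * a_i == a_i for binary a_i.
        Q[(i, i)] = lam - dot(d_i, x) + 0.5 * dot(d_i, d_i)
        for j in range(i + 1, len(features)):
            # Overlapping features penalize being active together.
            Q[(i, j)] = dot(d_i, features[j])
    return Q, 0.5 * dot(x, x)

def qubo_energy(Q, a):
    return sum(c * a[i] * a[j] for (i, j), c in Q.items())

def direct_energy(x, features, a, lam):
    """Reference implementation of the assumed objective, used to check the expansion."""
    recon = [sum(a[k] * d[p] for k, d in enumerate(features)) for p in range(len(x))]
    residual = [xp - rp for xp, rp in zip(x, recon)]
    return 0.5 * dot(residual, residual) + lam * sum(a)

# Toy data (assumed): the third feature alone reconstructs x exactly.
x = [1.0, 0.0, 1.0, 0.0]
features = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 0.0]]
lam = 0.1
Q, const = sparse_coding_qubo(x, features, lam)

# Brute force stands in for the annealer on this tiny instance.
best = min(product((0, 1), repeat=len(features)), key=lambda a: qubo_energy(Q, a))
```

On this toy instance the sparsity penalty favors activating the single combined feature over the two partial features that reconstruct the input equally well.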
Citations: 6
Physical Constraints on Quantum Circuits
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123663
P. Civalleri, F. Corinto, Á. Csurgay
The physical constraints underlying the concept of a quantum circuit are considered. In particular, it is shown that their modeling starts from the interconnection of the components into a classical network, followed by quantization of the latter, and not from the interconnection of already quantized components. The procedure is straightforward for lossless networks but cannot be carried out in the presence of resistors, because a Lagrangian function cannot be constructed. However, the difficulty is circumvented by distinguishing thermal from radiative resistors, the former being the usual ones, the latter being realized by semi-infinite LC transmission lines, for which the Lagrangian exists. In the complex plane s = σ + jω, the impedance of a thermal resistor is Z(s) = R in the entire plane, while that of a radiative resistor is Z(s) = R sign(σ), so that the latter is lossless and does not dissipate energy but conveys it to infinity. Comparison with the Lindblad approach shows that the resistor fits into it in the right half-plane (RHP). Radiative resistors make the evolution reversible, which is not true for a physical system that includes thermal resistors.
Citations: 0
Convolutional Drift Networks for Video Classification
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123647
Dillon Graham, Seyed Hamed Fatemi Langroudi, Christopher Kanan, D. Kudithipudi
Analyzing spatio-temporal data like video is a challenging task that requires processing visual and temporal information effectively. Convolutional Neural Networks have shown promise as baseline fixed feature extractors through transfer learning, a technique that helps minimize the training cost on visual information. Temporal information is often handled using hand-crafted features or Recurrent Neural Networks, but this can be overly specific or prohibitively complex. Building a fully trainable system that can efficiently analyze spatio-temporal data without hand-crafted features or complex training is an open challenge. We present a new neural network architecture to address this challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines the visual feature extraction power of deep Convolutional Neural Networks with the intrinsically efficient temporal processing provided by Reservoir Computing. In this introductory paper on the CDN, we provide a very simple baseline implementation tested on two egocentric (first-person) video activity datasets. We achieve video-level activity classification results on par with state-of-the-art methods. Notably, performance on this complex spatio-temporal task was produced by training only a single feed-forward layer in the CDN.
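The idea of training only a single feed-forward layer on top of a fixed temporal reservoir can be sketched in a few lines. The code below is a generic leaky echo-state reservoir with a gradient-trained linear readout, not the CDN itself: the toy two-dimensional "frame features" stand in for the outputs of a pretrained convolutional network, and every name and hyperparameter (`make_reservoir`, `final_state`, `leak`, `lr`, reservoir size) is an assumption chosen for illustration.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def make_reservoir(n_in, n_res, seed=0):
    """Fixed random input and recurrent weights; these are never trained."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    # Small recurrent weights so the state acts as a fading memory (heuristic choice).
    w_res = [[rng.uniform(-1, 1) / n_res for _ in range(n_res)] for _ in range(n_res)]
    return w_in, w_res

def final_state(w_in, w_res, frames, leak=0.3):
    """Drive the reservoir with per-frame features; return the last state plus a bias term."""
    state = [0.0] * len(w_res)
    for u in frames:
        pre = [dot(w_in[k], u) + dot(w_res[k], state) for k in range(len(state))]
        state = [(1 - leak) * s + leak * math.tanh(p) for s, p in zip(state, pre)]
    return state + [1.0]

def mse(w, states, targets):
    return sum((dot(w, s) - y) ** 2 for s, y in zip(states, targets)) / len(states)

def train_readout(states, targets, lr=0.01, steps=200):
    """Batch gradient descent on the only trainable weights: a linear readout."""
    w = [0.0] * len(states[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for s, y in zip(states, targets):
            err = dot(w, s) - y
            for k, sk in enumerate(s):
                grad[k] += 2 * err * sk / len(states)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

# Toy "videos": 8-frame sequences whose feature ramps up (label 1) or down (label 0).
videos, labels = [], []
for n in range(20):
    rising = n % 2 == 0
    ramp = [(t + n % 3) / 10.0 for t in range(8)]
    if not rising:
        ramp.reverse()
    videos.append([[v, 1.0 - v] for v in ramp])
    labels.append(1.0 if rising else 0.0)

w_in, w_res = make_reservoir(n_in=2, n_res=10)
states = [final_state(w_in, w_res, frames) for frames in videos]
mse_before = mse([0.0] * len(states[0]), states, labels)
readout = train_readout(states, labels)
mse_after = mse(readout, states, labels)
```

Training touches only `readout`; `w_in` and `w_res` stay at their random initial values, which is what keeps the training cost of a reservoir approach low.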
Citations: 10
Routing Congestion Aware Cell Library Development for Monolithic 3D ICs
Pub Date : 2017-11-01 DOI: 10.1109/ICRC.2017.8123686
Chen Yan, E. Salman
According to the International Roadmap for Devices and Systems (IRDS), there is no headroom for 2D geometry scaling after 2024. IRDS predicts that, by 2024, monolithic 3D integration technology will be one of the most critical performance boosters. This prediction follows the highly promising and relatively recent improvements in the fabrication of second-tier devices on a single substrate. In this study, a fully functional and open-source cell library is developed to design and characterize large-scale monolithic 3D ICs. The second version of the 3D cell library is equipped with the required files for integration into existing design automation tools, thereby enabling chip-level benchmarking. Furthermore, multiple versions of the library are proposed to investigate the tradeoffs among routability, timing, power, and area characteristics. The 3D library can also be used to analyze some of the important issues in monolithic 3D ICs, such as chip-level thermal characteristics, efficacy of various thermal management methodologies, and design-for-test methods for monolithic 3D integration.
Citations: 4