VIBNN: Hardware Acceleration of Bayesian Neural Networks

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems. Pub Date: 2018-02-02. DOI: 10.1145/3173162.3173212

R. Cai, Ao Ren, Ning Liu, Caiwen Ding, Luhao Wang, Xuehai Qian, Massoud Pedram, Yanzhi Wang


Citations: 71

Abstract

Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By introducing weights associated with conditioned probability distributions, BNNs can resolve the overfitting issue commonly seen in conventional neural networks and allow for small-data training through the variational inference process. The frequent use of Gaussian random variables in this process requires a properly optimized Gaussian Random Number Generator (GRNG). The high hardware cost of conventional GRNGs makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for the massive number of Gaussian variable sampling tasks in BNNs. Specifically, we introduce two high-performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), which is inspired by the properties of the binomial distribution and linear feedback logic; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deeply pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA can achieve a throughput of 321,543.4 Images/s and an energy efficiency of up to 52,694.8 Images/J while maintaining accuracy similar to that of their software counterpart.
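The abstract says the RLF-GRNG is inspired by the binomial distribution and linear feedback logic, but gives no circuit details. The sketch below is therefore not the paper's design. It only illustrates the general idea in software: the sum of n fair pseudo-random bits is Binomial(n, 0.5), which approaches a Gaussian as n grows. The 16-bit LFSR with taps 16, 14, 13, 11, the window size `n=128`, and the weight formula `w = mu + sigma * eps` (a common way to sample a Gaussian weight in variational inference) are all assumptions chosen for illustration, as are the function names.

```python
import math
from itertools import islice


def lfsr16_bits(seed=0xACE1):
    """Yield pseudo-random bits from a 16-bit Fibonacci LFSR.

    Taps 16, 14, 13, 11 are a textbook maximal-length choice,
    assumed here for illustration (not taken from the paper).
    """
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    while True:
        # XOR the tapped bits to form the feedback bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield bit


def binomial_gaussian(bits, n=128):
    """Yield roughly N(0, 1) samples by summing n bits at a time.

    The sum of n fair bits has mean n/2 and variance n/4, so the
    count of ones is centred and scaled accordingly.
    """
    mean, std = n / 2, math.sqrt(n) / 2
    while True:
        ones = sum(islice(bits, n))
        yield (ones - mean) / std


def sample_weight(mu, sigma, eps):
    """Draw one weight w = mu + sigma * eps (assumed Gaussian weight model)."""
    return mu + sigma * eps


eps_stream = binomial_gaussian(lfsr16_bits())
samples = [next(eps_stream) for _ in range(500)]
# Hypothetical mean 0.1 and standard deviation 0.05 for a single weight.
weights = [sample_weight(0.1, 0.05, e) for e in samples]
```

Each sample here costs 128 LFSR steps, which is the kind of overhead a hardware design would need to avoid; the sketch shows the statistical idea only, not an efficient generator.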



