FPGA Fastfood - A High Speed Systolic Implementation of a Large Scale Online Kernel Method

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Pub Date: 2018-02-15. DOI: 10.1145/3174243.3174271

Sean Fox, D. Boland, P. Leong


Citations: 3

Abstract




In this paper, we describe a systolic Field Programmable Gate Array (FPGA) implementation of the Fastfood algorithm that is optimised to run at a high frequency. The Fastfood algorithm supports online learning for large scale kernel methods. Empirical results show that 500 MHz clock rates can be sustained for an architecture that can solve problems with input dimensions that are $10^3$ times larger than previously reported. Unlike many recent deep learning publications, this design implements both training and prediction. This enables the use of kernel methods in applications requiring a rare combination of capacity, adaption and speed.
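The abstract names the Fastfood algorithm without describing the transform it computes. The following is a minimal software sketch, assuming the commonly cited Fastfood formulation $V = \frac{1}{\sigma\sqrt{d}} S H G \Pi H B$ with cosine and sine features approximating a Gaussian (RBF) kernel. It is not the systolic FPGA architecture from the paper, and it omits the online training step. The function names `fwht`, `make_fastfood_block` and `fastfood_features`, and the parameter `sigma`, are illustrative choices rather than anything taken from the source.

```python
import math
import random


def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform; length must be a power of two."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def make_fastfood_block(d, sigma, rng):
    """Draw the random diagonals and permutation for one d x d block (assumed form)."""
    B = [rng.choice((-1.0, 1.0)) for _ in range(d)]      # random sign flips
    perm = list(range(d))
    rng.shuffle(perm)                                    # random permutation Pi
    G = [rng.gauss(0.0, 1.0) for _ in range(d)]          # Gaussian diagonal
    g_norm = math.sqrt(sum(g * g for g in G))
    # Chi-distributed row lengths (d degrees of freedom), rescaled by ||G||.
    S = [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))) / g_norm
        for _ in range(d)
    ]
    return {"B": B, "perm": perm, "G": G, "S": S, "sigma": sigma}


def fastfood_features(x, block):
    """Map x to 2*d features: (1/sqrt(d)) * [cos(Vx), sin(Vx)]."""
    d = len(x)
    v = fwht([b * xi for b, xi in zip(block["B"], x)])   # H B x
    v = [v[p] for p in block["perm"]]                    # Pi H B x
    v = fwht([g * vi for g, vi in zip(block["G"], v)])   # H G Pi H B x
    scale = 1.0 / (block["sigma"] * math.sqrt(d))
    z = [scale * s * vi for s, vi in zip(block["S"], v)]
    norm = 1.0 / math.sqrt(d)
    return [norm * math.cos(zi) for zi in z] + [norm * math.sin(zi) for zi in z]
```

Under these assumptions, the dot product of two feature vectors is a randomised estimate of $\exp(-\lVert x - y \rVert^2 / (2\sigma^2))$. The two Walsh-Hadamard transforms take $O(d \log d)$ operations, compared with $O(d^2)$ for multiplying by a dense random matrix.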
