Dynamic Neural Accelerator for Reconfigurable & Energy-efficient Neural Network Inference

Nikolay Nez, Antonio N. Vilchez, H. Zohouri, Oleg Khavin, Sakyasingha Dasgupta

2021 IEEE Hot Chips 33 Symposium (HCS), 2021-08-22
DOI: 10.1109/HCS52781.2021.9566886 (https://doi.org/10.1109/HCS52781.2021.9566886)
Abstract
Unique Challenges for AI Inference Hardware at the Edge

• Peak TOPS or TOPS/Watt are not ideal measures of performance at the edge; raw performance cannot be prioritized over power efficiency (sustained throughput per watt). A back-of-the-envelope comparison is sketched after this list.
• Much AI hardware relies on batching to improve utilization, which is unsuitable for the streaming-data (batch size 1) use case at the edge.
• AI hardware architectures that fully cache network parameters in large on-chip SRAM cannot easily be scaled down to sizes applicable to edge workloads.
• Need adaptability to new workloads and the ability to deploy multiple AI models.
• An AI-specific accelerator needs to operate within heterogeneous compute environments.
• Need efficient compilation and scheduling to maximize compute utilization.
• Need high software robustness and usability.
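To make the first two points concrete, here is a minimal back-of-the-envelope sketch. The accelerator profiles, peak TOPS, power draws, and utilization figures below are hypothetical assumptions (not measurements from the paper), chosen only to illustrate how peak TOPS can mislead when utilization collapses at batch size 1.

```python
# Hypothetical comparison of two edge accelerators at batch size 1.
# All numbers are illustrative assumptions, not data from the paper.

def effective_tops_per_watt(peak_tops: float, utilization: float, watts: float) -> float:
    """Sustained TOPS/Watt = peak TOPS x achieved utilization / power draw."""
    return peak_tops * utilization / watts

# Accelerator A: high peak TOPS, but relies on batching; at batch size 1
# the wide compute array sits mostly idle, so utilization collapses.
a = effective_tops_per_watt(peak_tops=32.0, utilization=0.15, watts=10.0)

# Accelerator B: lower peak TOPS, but a reconfigurable datapath keeps
# utilization high on single-stream (batch size 1) workloads.
b = effective_tops_per_watt(peak_tops=8.0, utilization=0.70, watts=4.0)

print(f"A: {a:.2f} effective TOPS/W")  # 0.48
print(f"B: {b:.2f} effective TOPS/W")  # 1.40 -- ~3x better despite 4x lower peak TOPS
```

Under these assumed figures, the accelerator with 4x lower peak TOPS delivers roughly 3x better sustained throughput per watt at batch size 1, which is the metric the abstract argues matters at the edge.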