C3-Flow: Compute Compression Co-Design Flow for Deep Neural Networks

Matthew Sotoudeh, Sara S. Baghsorkhi
{"title":"C3-Flow: Compute Compression Co-Design Flow for Deep Neural Networks","authors":"Matthew Sotoudeh, Sara S. Baghsorkhi","doi":"10.1145/3316781.3317786","DOIUrl":null,"url":null,"abstract":"Existing approaches to neural network compression have failed to holistically address algorithmic (training accuracy) and computational (inference performance) demands of real-world systems, particularly on resource-constrained devices. We present C3-Flow, a new approach adding non-uniformity to low-rank approximations and designed specifically to enable highly-efficient computation on common hardware architectures while retaining more accuracy than competing methods. Evaluation on two state-of-the-art acoustic models (versus existing work, empirical limit study approaches, and hand-tuned models) demonstrates up to 60% lower error. Finally, we show that our co-design approach achieves up to 14X inference speedup across three Haswell- and Broadwell-based platforms.","PeriodicalId":391209,"journal":{"name":"Proceedings of the 56th Annual Design Automation Conference 2019","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 56th Annual Design Automation Conference 2019","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3316781.3317786","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Existing approaches to neural network compression have failed to holistically address algorithmic (training accuracy) and computational (inference performance) demands of real-world systems, particularly on resource-constrained devices. We present C3-Flow, a new approach adding non-uniformity to low-rank approximations and designed specifically to enable highly-efficient computation on common hardware architectures while retaining more accuracy than competing methods. Evaluation on two state-of-the-art acoustic models (versus existing work, empirical limit study approaches, and hand-tuned models) demonstrates up to 60% lower error. Finally, we show that our co-design approach achieves up to 14X inference speedup across three Haswell- and Broadwell-based platforms.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
C3-Flow:深度神经网络计算压缩协同设计流程
现有的神经网络压缩方法未能全面解决现实世界系统的算法(训练精度)和计算(推理性能)需求,特别是在资源受限的设备上。我们提出了C3-Flow,这是一种将非均匀性添加到低秩近似中的新方法,专门设计用于在通用硬件架构上实现高效计算,同时保持比竞争方法更高的准确性。对两种最先进的声学模型(与现有工作、经验极限研究方法和手动调谐模型相比)的评估表明,误差降低了60%。最后,我们展示了我们的协同设计方法在三个基于Haswell和broadwell的平台上实现了高达14倍的推理加速。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
LODESTAR DHOOM Filianore ChipSecure MRLoc
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1