Entropy-based pruning of hidden units to reduce DNN parameters

G. Mantena, K. Sim
{"title":"基于熵的隐单元剪枝以减少深度神经网络参数","authors":"G. Mantena, K. Sim","doi":"10.1109/SLT.2016.7846335","DOIUrl":null,"url":null,"abstract":"For acoustic modeling, the use of DNN has become popular due to its superior performance improvements observed in many automatic speech recognition (ASR) tasks. Typically, DNNs with deep (many layers) and wide (many hidden units per layer) architectures are chosen in order to achieve good gains. An issue with such approaches is that there is an explosion in the number of learnable parameters. Thus, it is often difficult to build models in cases where there is no sufficient amount of training data (or data for adaptation), and also limits the usage of ASR systems on hand-held devices such as mobile phones. A method to overcome this issue is to reduce the number of parameters. In this work, we provide a framework to effectively reduce the number of parameters by removing the hidden units. Each hidden unit is represented by an activity vector associated with speech attributes such as phones. A normalized entropy-based measure is computed from these activity vectors which reflects the significance of these units in the DNN model. For comparison we also use low-rank matrix factorization to reduce the number of parameters. We show that low-rank matrix factorization can reduce the number of parameters only to a certain extent. Thus, we extend the pruning technique in combination with low-rank matrix factorization to further reduce the model. In this work, we provide detailed experimental results on the Aurora-4 and TEDLIUM databases and show that the models can be reduced to approximately 20 – 30% of its initial size without much loss in the ASR performance.","PeriodicalId":281635,"journal":{"name":"2016 IEEE Spoken Language Technology Workshop (SLT)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Entropy-based pruning of hidden units to reduce DNN parameters\",\"authors\":\"G. Mantena, K. Sim\",\"doi\":\"10.1109/SLT.2016.7846335\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For acoustic modeling, the use of DNN has become popular due to its superior performance improvements observed in many automatic speech recognition (ASR) tasks. Typically, DNNs with deep (many layers) and wide (many hidden units per layer) architectures are chosen in order to achieve good gains. An issue with such approaches is that there is an explosion in the number of learnable parameters. Thus, it is often difficult to build models in cases where there is no sufficient amount of training data (or data for adaptation), and also limits the usage of ASR systems on hand-held devices such as mobile phones. A method to overcome this issue is to reduce the number of parameters. In this work, we provide a framework to effectively reduce the number of parameters by removing the hidden units. Each hidden unit is represented by an activity vector associated with speech attributes such as phones. A normalized entropy-based measure is computed from these activity vectors which reflects the significance of these units in the DNN model. For comparison we also use low-rank matrix factorization to reduce the number of parameters. We show that low-rank matrix factorization can reduce the number of parameters only to a certain extent. 
Thus, we extend the pruning technique in combination with low-rank matrix factorization to further reduce the model. In this work, we provide detailed experimental results on the Aurora-4 and TEDLIUM databases and show that the models can be reduced to approximately 20 – 30% of its initial size without much loss in the ASR performance.\",\"PeriodicalId\":281635,\"journal\":{\"name\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SLT.2016.7846335\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2016.7846335","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

For acoustic modeling, the use of DNNs has become popular due to the superior performance improvements observed in many automatic speech recognition (ASR) tasks. Typically, DNNs with deep (many layers) and wide (many hidden units per layer) architectures are chosen in order to achieve good gains. An issue with such approaches is that the number of learnable parameters explodes. Thus, it is often difficult to build models when there is not a sufficient amount of training data (or data for adaptation), and the large model size also limits the use of ASR systems on hand-held devices such as mobile phones. A method to overcome this issue is to reduce the number of parameters. In this work, we provide a framework to effectively reduce the number of parameters by removing hidden units. Each hidden unit is represented by an activity vector associated with speech attributes such as phones. A normalized entropy-based measure, which reflects the significance of these units in the DNN model, is computed from these activity vectors. For comparison, we also use low-rank matrix factorization to reduce the number of parameters. We show that low-rank matrix factorization can reduce the number of parameters only to a certain extent. Thus, we combine the pruning technique with low-rank matrix factorization to further reduce the model. We provide detailed experimental results on the Aurora-4 and TEDLIUM databases and show that the models can be reduced to approximately 20–30% of their initial size without much loss in ASR performance.
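The paper itself is summarized here only by its abstract, but the entropy criterion it describes can be illustrated with a short sketch. The function names, array shapes, and the choice to prune high-entropy (near-uniform) units first are assumptions made for illustration, not the authors' exact procedure; the precise measure and pruning direction are defined in the full paper.

```python
import numpy as np

def normalized_entropy(activity, eps=1e-12):
    """Normalized entropy (in [0, 1]) of one hidden unit's activity vector.

    `activity` holds the unit's average activation for each phone class
    (one value per phone). The vector is renormalized to a probability
    distribution before the entropy is computed.
    """
    p = np.maximum(activity, 0.0)
    p = p / (p.sum() + eps)
    h = -np.sum(p * np.log(p + eps))
    return h / np.log(len(p))  # divide by log K so the value lies in [0, 1]

def prune_units(W_in, b, W_out, activity_matrix, keep_fraction=0.7):
    """Remove hidden units of one layer according to the entropy measure.

    W_in:  (n_hidden, n_in)     weights into the layer
    b:     (n_hidden,)          biases of the layer
    W_out: (n_out, n_hidden)    weights out of the layer
    activity_matrix: (n_hidden, n_phones) per-phone activity vectors
    """
    scores = np.array([normalized_entropy(a) for a in activity_matrix])
    n_keep = int(round(keep_fraction * len(scores)))
    # Assumption: units with near-uniform activity over phones (high
    # normalized entropy) are treated as least significant and pruned first.
    keep = np.argsort(scores)[:n_keep]
    return W_in[keep], b[keep], W_out[:, keep]
```

Removing a hidden unit deletes one row of the incoming weight matrix and one column of the outgoing one, which is why unit-level pruning shrinks both adjacent layers at once.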
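The low-rank baseline mentioned in the abstract can likewise be sketched with a truncated SVD; which layers are factorized and how the rank is chosen are not stated here, so the numbers below are purely illustrative. The sketch also shows why factorization alone reduces parameters only up to a point: the rank must stay well below the layer width before there is any saving at all.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (n_out x n_in) by A @ B with A: (n_out, rank), B: (rank, n_in).

    Replacing W by the pair (A, B) changes the parameter count from
    n_out * n_in to rank * (n_out + n_in), a reduction only when
    rank < n_out * n_in / (n_out + n_in).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # absorb the singular values into A
    B = Vt[:rank, :]
    return A, B

# Example: a hypothetical 2048 x 2048 layer factorized with rank 256
W = np.random.randn(2048, 2048).astype(np.float32)
A, B = low_rank_factorize(W, rank=256)
print(W.size, A.size + B.size)  # 4194304 vs 1048576 parameters
```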