Theoretical learning guarantees applied to acoustic modeling

Christopher D. Shulby, Martha D. Ferreira, Rodrigo F. de Mello, Sandra M. Aluisio
Journal of the Brazilian Computer Society. Published 2019-01-04. DOI: 10.1186/s13173-018-0081-3 (https://doi.org/10.1186/s13173-018-0081-3)
Citations: 7

Abstract

In low-resource scenarios, such as small datasets or limited computational resources, state-of-the-art deep learning methods for speech recognition have been known to fail. More robust models can be achieved if care is taken to satisfy the learning guarantees provided by statistical learning theory. This work presents a shallow, hybrid approach in which a convolutional neural network feature extractor feeds a hierarchical tree of support vector machines for classification. We show that gross errors present even in state-of-the-art systems can be avoided and that an accurate acoustic model can be built in a hierarchical fashion. Furthermore, we present proof that our algorithm adheres to the learning guarantees provided by statistical learning theory. The acoustic model produced in this work outperforms traditional hidden Markov models, and the hierarchical support vector machine tree outperforms a multi-class multilayer perceptron classifier using the same features. More importantly, we isolate the performance of the acoustic model and report results at both the frame and phoneme levels, in order to assess the true robustness of the model. We show that even with a small amount of data, accurate and robust recognition rates can be obtained.
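The hierarchical classification idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the phoneme grouping, the 2-D toy features, and all function names below are hypothetical, and a nearest-centroid rule stands in for each binary support vector machine so that the example needs only the Python standard library. In the described system, the inputs would instead be features produced by the convolutional network.

```python
from dataclasses import dataclass
from typing import Union


def centroid(points):
    """Mean vector of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    # Stand-in for a binary SVM: one centroid per branch.
    left_centroid: list
    right_centroid: list
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def labels_of(grouping):
    """Flatten a nested (left, right) grouping into its leaf labels."""
    if isinstance(grouping, str):
        return [grouping]
    return labels_of(grouping[0]) + labels_of(grouping[1])


def build_tree(grouping, samples):
    """Fit one binary decision per internal node of the grouping."""
    if isinstance(grouping, str):
        return Leaf(grouping)
    left_g, right_g = grouping
    left_pts = [x for lab in labels_of(left_g) for x in samples[lab]]
    right_pts = [x for lab in labels_of(right_g) for x in samples[lab]]
    return Node(
        centroid(left_pts),
        centroid(right_pts),
        build_tree(left_g, samples),
        build_tree(right_g, samples),
    )


def predict(tree, x):
    """Route a feature vector from the root down to a leaf label."""
    node = tree
    while isinstance(node, Node):
        go_left = sq_dist(x, node.left_centroid) <= sq_dist(x, node.right_centroid)
        node = node.left if go_left else node.right
    return node.label


# Hypothetical toy data: two vowel-like and two fricative-like classes.
samples = {
    "a": [(0.0, 0.0), (0.2, 0.1)],
    "i": [(0.0, 2.0), (0.1, 2.2)],
    "s": [(5.0, 0.0), (5.1, 0.2)],
    "z": [(5.0, 2.0), (4.9, 2.1)],
}
grouping = (("a", "i"), ("s", "z"))  # illustrative tree, not the paper's
tree = build_tree(grouping, samples)
```

Each internal node only has to separate two groups of classes, so a frame is first assigned to a broad group and then refined, which is one plausible reading of how a hierarchy can keep a misclassification within a similar group rather than producing a gross error.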
Source journal
Journal of the Brazilian Computer Society
Subject area: Computer Science (all)
CiteScore: 2.40
Self-citation rate: 0.00%
Articles published: 2
About the journal: JBCS is a formal quarterly publication of the Brazilian Computer Society. It is a peer-reviewed international journal which aims to serve as a forum to disseminate innovative research in all fields of computer science and related subjects. Theoretical, practical and experimental papers reporting original research contributions are welcome, as well as high-quality survey papers. The journal is open to contributions in all computer science topics, computer systems development or in formal and theoretical aspects of computing, as the list of topics below is not exhaustive. Contributions will be considered for publication in JBCS if they have not been published previously and are not under consideration for publication elsewhere.