Pipelining Acoustic Model Training for Speech Recognition Using Storm

D. Sitaram, Haripriya Srinivasaraghavan, Kapish Agarwal, Amritanshu Agrawal, N. Joshi, Debraj Ray
Published in: 2013 Fifth International Conference on Computational Intelligence, Modelling and Simulation
Publication date: 2013-09-24
DOI: 10.1109/CIMSIM.2013.42 (https://doi.org/10.1109/CIMSIM.2013.42)
Citations: 1

Abstract

Speech recognition is increasingly used on mobile devices, which in turn has increased the need for new acoustic models covering various languages, dialects, accents, speakers, and environmental conditions. This involves training and adapting a huge number of acoustic models, some of them in real time. Training acoustic models is essential for speech recognition because these models determine the accuracy and quality of the recognition process. This paper discusses the use of Storm, a distributed real-time computation system, to pipeline the creation of acoustic models by CMU Sphinx, an open-source software project for speech recognition and training. Software pipelining reduces the time required for training and optimizes system resource utilization, enabling large amounts of data to be trained in considerably less time than the conventional sequential process takes. Pipelining is achieved by grouping the steps of the training process into five stages and running each stage on an individual node in a Storm cluster. Acoustic models are thus created by training multiple streams of speech samples using the same SphinxTrain setup, which improves both training time and throughput.
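The throughput argument in the abstract can be illustrated with a small timing model. The sketch below is not the paper's implementation and does not use the Storm API; it only compares the total time of a sequential run against an idealized five-stage pipeline in which each stage runs on its own node. The stage durations are hypothetical values chosen for illustration, and the model assumes constant stage times with no communication overhead between nodes.

```python
from typing import Sequence


def sequential_time(stage_times: Sequence[float], n_streams: int) -> float:
    """Total time when each stream completes every stage before the next stream starts."""
    return n_streams * sum(stage_times)


def pipelined_time(stage_times: Sequence[float], n_streams: int) -> float:
    """Total time when each stage runs on its own node and streams pass through in order."""
    # finish[j] holds the time at which stage j finished its most recent stream.
    finish = [0.0] * len(stage_times)
    for _ in range(n_streams):
        prev_stage_done = 0.0  # when this stream left the previous stage
        for j, duration in enumerate(stage_times):
            # A stage starts once the stream has arrived and the stage is free.
            start = max(prev_stage_done, finish[j])
            finish[j] = start + duration
            prev_stage_done = finish[j]
    return finish[-1]


if __name__ == "__main__":
    # Hypothetical durations (arbitrary time units) for five training stages.
    stage_times = [4.0, 6.0, 5.0, 3.0, 2.0]
    n_streams = 10
    print("sequential:", sequential_time(stage_times, n_streams))
    print("pipelined: ", pipelined_time(stage_times, n_streams))
```

Under these assumptions the pipelined total equals the sum of the stage times plus (number of streams - 1) times the slowest stage, so the slowest stage bounds the achievable throughput. Real gains depend on how evenly the training steps are grouped into stages and on data transfer costs, which this model ignores.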