The A* speech recognition system on parallel architectures

2012 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA)
Pub Date: 2012-07-02
DOI: 10.1109/ISSPA.2012.6310452

P. Cardinal, Gilles Boulianne, P. Dumouchel

Citations: 2

Abstract

The speed of modern processors has remained constant over the last few years, but integration capacity continues to follow Moore's law; to be scalable, applications must therefore be parallelized. In addition to the main CPU, almost every computer is equipped with a Graphics Processing Unit (GPU), which is in essence a specialized parallel processor. This paper explores how the performance of speech recognition systems in large-vocabulary applications can be enhanced by using the A* algorithm, which parallelizes better than the Viterbi algorithm, together with a GPU for the acoustic computations. First experiments with a “unigram approximation” heuristic resulted in approximately 8.7 times fewer states being explored compared to our classical Viterbi decoder. The multi-threaded implementation of the A* decoder, combined with a GPU for acoustic computation, led to a speed-up factor of 5.2 over its sequential counterpart and an absolute accuracy improvement of 5% over the sequential Viterbi search at real time.
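The claim that a heuristic lets A* explore fewer states than an exhaustive search can be illustrated with a generic toy example. The sketch below is not the authors' decoder and does not implement the "unigram approximation" heuristic; it assumes a hypothetical 7x7 grid with unit step costs and uses the Manhattan distance as a stand-in admissible heuristic. All names (`search`, `grid_neighbors`, `manhattan`) are illustrative only.

```python
import heapq

def search(neighbors, start, goal, heuristic):
    """Best-first search ordered by cost-so-far plus heuristic estimate.

    With a zero heuristic this is uniform-cost search; with an admissible
    heuristic it is A*. Returns (path cost, number of expanded states).
    """
    frontier = [(heuristic(start), 0, start)]
    best = {start: 0}          # cheapest known cost to reach each state
    expanded = 0
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best[node]:
            continue           # stale queue entry, a cheaper path was found
        expanded += 1
        if node == goal:
            return cost, expanded
        for nxt, step in neighbors(node):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None, expanded

# Hypothetical toy problem: a 7x7 grid with unit step costs.
SIZE = 7
START, GOAL = (3, 3), (3, 6)

def grid_neighbors(node):
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIZE and 0 <= nc < SIZE:
            yield (nr, nc), 1

def manhattan(node):
    # Never overestimates the remaining cost on this grid (admissible).
    return abs(node[0] - GOAL[0]) + abs(node[1] - GOAL[1])

def no_heuristic(node):
    return 0

ucs_cost, ucs_expanded = search(grid_neighbors, START, GOAL, no_heuristic)
astar_cost, astar_expanded = search(grid_neighbors, START, GOAL, manhattan)
print("uniform-cost:", ucs_cost, "states expanded:", ucs_expanded)
print("A*:", astar_cost, "states expanded:", astar_expanded)
```

Both runs find the same optimal cost, but the heuristic-guided run expands fewer states. In a speech decoder the heuristic would instead estimate the remaining acoustic and language-model cost, which this toy example does not attempt to model.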
