A speech and character combined recognition engine for mobile devices

International Journal of Pervasive Computing and Communications (IF 0.8, Q4, Computer Science, Interdisciplinary Applications). Pub Date: 2006-08-01. DOI: 10.1108/17427370810890409

Soo-Young Suk, Hyun-Yeol Chung

Volume 7 1, pages 549-559.

Citations: 2

Abstract

A Speech and Character Combined Recognition Engine (SCCRE) is developed for Personal Digital Assistants (PDAs) and other mobile devices. In SCCRE, features are extracted from speech and from characters separately, but recognition is performed in a single engine. The recognition engine is essentially a CHMM (Continuous Hidden Markov Model) with a variable parameter topology, which minimizes the number of model parameters and reduces recognition time. The model also adopts our proposed SSMS (Successive State and Mixture Splitting) method for generating context-independent models. SSMS optimizes the number of mixtures by splitting in the mixture domain and the number of states by splitting in the time domain. When the engine with SSMS was applied to speech recognition for mobile devices, SSMS reduced the total number of Gaussians by up to 40.0% compared with fixed-parameter models at the same recognition performance. As a result, SSMS reduces the memory required for the models to 65% and that for processing to 82%. Moreover, recognition time decreases by 17% with the SSMS model while the recognition rate is maintained.
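The abstract does not give the SSMS splitting criterion, so the following is only a minimal sketch of what "splitting in the mixture domain" can look like for one HMM state with diagonal-covariance Gaussians. It assumes a common heuristic (split the heaviest component and shift the two copies by a fraction of a standard deviation), which may differ from the authors' method. The names `Gaussian`, `split_heaviest`, `parameter_count`, and the `eps` parameter are hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    weight: float
    mean: List[float]
    var: List[float]  # diagonal covariance (assumption)


def split_heaviest(mixture: List[Gaussian], eps: float = 0.2) -> List[Gaussian]:
    """Return a new mixture with the heaviest component split in two.

    Each child keeps the parent's variances, takes half of its weight, and
    has its mean shifted by +/- eps standard deviations. This is a generic
    heuristic, not the SSMS criterion from the paper.
    """
    idx = max(range(len(mixture)), key=lambda i: mixture[i].weight)
    parent = mixture[idx]
    offset = [eps * math.sqrt(v) for v in parent.var]
    plus = Gaussian(parent.weight / 2,
                    [m + o for m, o in zip(parent.mean, offset)],
                    list(parent.var))
    minus = Gaussian(parent.weight / 2,
                     [m - o for m, o in zip(parent.mean, offset)],
                     list(parent.var))
    return mixture[:idx] + [plus, minus] + mixture[idx + 1:]


def parameter_count(states: List[List[Gaussian]]) -> int:
    """Count emission parameters: one weight, D means, D variances per Gaussian."""
    return sum(1 + 2 * len(g.mean) for state in states for g in state)


if __name__ == "__main__":
    # Grow one state from a single Gaussian to four components.
    state = [Gaussian(1.0, [0.0, 0.0], [1.0, 1.0])]
    for _ in range(3):
        state = split_heaviest(state)
    print(len(state), parameter_count([state]))
```

In a variable-topology model, each state would stop splitting at a different mixture count, and `parameter_count` gives a rough way to compare the resulting emission-parameter total against a fixed number of mixtures per state.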

