Finite-state super transducers for compact language resource representation in edge voice-AI

Systems Science & Control Engineering (IF 3.2, JCR Q2, Automation & Control Systems) | Pub Date: 2022-06-23 | DOI: 10.1080/21642583.2022.2089930
S. Dobrišek, Ziga Golob, Jerneja Žganec Gros
Citations: 0

Abstract

Finite-state super transducers for compact language resource representation in edge voice-AI
Finite-state transducers have been shown to yield compact representations of the pronunciation dictionaries used for grapheme-to-phoneme conversion in speech engines running on low-resource embedded platforms. For highly inflected languages, however, even more efficient methods of reducing language resources are needed. In this paper, we demonstrate that the size of a finite-state transducer tends to decrease once the number of word forms in the modelled pronunciation dictionary reaches a certain threshold. Motivated by this finding, we propose and evaluate a new type of finite-state transducer, called the ‘finite-state super transducer’, which represents a pronunciation dictionary with fewer states and transitions and thereby reduces the size of the language resource representation by up to 25% compared with minimal deterministic finite-state transducers. We further demonstrate that finite-state super transducers exhibit a generalization capability: they may accept, and thereby phonetically convert, inflected word forms that were not represented in the pronunciation dictionary used to build them. The method is suitable for speech engines operating on platforms at the edge of an AI system with restricted memory and processing power, where efficient speech processing based on compact language resources is required.
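To make the notion of "states and transitions" concrete, the following is a minimal Python sketch of how a pronunciation dictionary can be stored as a finite-state machine and then compacted by merging equivalent states. It is an illustration under stated assumptions, not the authors' method: the toy word list, the one-to-one grapheme-to-phoneme alignment, and all names (State, build_trie, minimize, count, transduce) are hypothetical. It performs only ordinary suffix merging on an acyclic machine over grapheme:phoneme pair labels; the super transducer construction described in the abstract is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (grapheme, phoneme); the alignment is assumed to be given


class State:
    def __init__(self) -> None:
        self.final = False
        self.arcs: Dict[Pair, "State"] = {}


def build_trie(lexicon: List[List[Pair]]) -> State:
    """Store every dictionary entry as a path in a prefix tree."""
    root = State()
    for entry in lexicon:
        node = root
        for pair in entry:
            node = node.arcs.setdefault(pair, State())
        node.final = True
    return root


def minimize(root: State) -> State:
    """Merge states that accept the same set of suffixes (acyclic case only)."""
    registry: Dict[tuple, State] = {}

    def canon(node: State) -> State:
        # Canonicalize children first, then look the node up by its signature.
        for pair in list(node.arcs):
            node.arcs[pair] = canon(node.arcs[pair])
        key = (node.final,
               tuple(sorted((p, id(t)) for p, t in node.arcs.items())))
        return registry.setdefault(key, node)

    return canon(root)


def count(root: State) -> Tuple[int, int]:
    """Return (number of reachable states, number of transitions)."""
    seen = set()
    stack = [root]
    n_arcs = 0
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        n_arcs += len(node.arcs)
        stack.extend(node.arcs.values())
    return len(seen), n_arcs


def transduce(root: State, word: str) -> Optional[str]:
    """Map a spelling to its phoneme string, or None if it is not accepted."""
    def walk(node: State, i: int, out: List[str]) -> Optional[str]:
        if i == len(word):
            return "".join(out) if node.final else None
        for (g, p), target in node.arcs.items():
            if g == word[i]:
                result = walk(target, i + 1, out + [p])
                if result is not None:
                    return result
        return None

    return walk(root, 0, [])


# Made-up inflected forms: two stems sharing the same three endings.
# Each letter is paired with its upper-case form as a stand-in "phoneme".
WORDS = ["mata", "mate", "mati", "kata", "kate", "kati"]
LEXICON = [[(ch, ch.upper()) for ch in w] for w in WORDS]

trie = build_trie(LEXICON)
print("prefix tree:", count(trie))      # (states, transitions) before merging
compact = minimize(trie)
print("after merging:", count(compact))  # shared endings are stored once
print(transduce(compact, "kate"))
```

On this toy input the shared endings collapse into a single set of states, which is the kind of saving that regular inflection makes possible. Note that the merged machine still accepts exactly the listed words; accepting unseen word forms, as the abstract describes for super transducers, would require merging states that are not strictly equivalent.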