Phone-to-word decoding through statistical machine translation and complementary system combination

2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373281

D. Falavigna, M. Gerosa, R. Gretter, D. Giuliani

Citations: 4

Abstract

In this paper, phone-to-word transduction is first investigated by coupling a speech recognizer, which generates a phone sequence or a phone confusion network for each speech segment, with the efficient confusion network decoder adopted by MOSES, a popular statistical machine translation toolkit. Then, system combination is investigated by combining the outputs of several conventional ASR systems with the output of a system embedding phone-to-word decoding through statistical machine translation. Experiments are carried out on a large vocabulary speech recognition task: the transcription of speeches delivered in English during the European Parliament Plenary Sessions (EPPS). While only a marginal performance improvement is achieved in system combination experiments when the output of the phone-to-word transducer is included in the combination, partial results show great potential for improvement.
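To make the notion of a phone confusion network more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a confusion network can be represented as a sequence of columns, each holding competing phone hypotheses with posterior probabilities. The names `best_path`, `to_column_lines`, and the `*EPS*` empty-slot marker are hypothetical, as are the example phones and scores. The one-column-per-line text layout is only an assumed illustration of how such a network might be handed to a decoder; the exact input format expected by MOSES should be checked against its documentation.

```python
from typing import List, Tuple

# A confusion network is a sequence of columns (slots); each column holds
# competing phone hypotheses with their posterior probabilities.
Column = List[Tuple[str, float]]
ConfusionNetwork = List[Column]

EPS = "*EPS*"  # hypothetical marker meaning "no phone in this slot"


def best_path(cn: ConfusionNetwork) -> List[str]:
    """Pick the highest-posterior phone in each column, dropping empty slots."""
    path = []
    for column in cn:
        phone, _ = max(column, key=lambda alt: alt[1])
        if phone != EPS:
            path.append(phone)
    return path


def to_column_lines(cn: ConfusionNetwork) -> str:
    """Serialize one column per line as alternating 'token score' pairs
    (an assumed layout, for illustration only)."""
    lines = []
    for column in cn:
        lines.append(" ".join(f"{phone} {prob:.2f}" for phone, prob in column))
    return "\n".join(lines)


# Invented example: four slots with made-up phones and posteriors.
example = [
    [("hh", 0.9), ("f", 0.1)],
    [("eh", 0.6), ("ah", 0.4)],
    [(EPS, 0.7), ("l", 0.3)],
    [("ow", 1.0)],
]

if __name__ == "__main__":
    print(best_path(example))
    print(to_column_lines(example))
```

Decoding the full network rather than only the single best phone sequence lets the phone-to-word stage recover from recognizer errors when the correct phone survives as a lower-ranked alternative in its column.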


