IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01: Latest Articles

Directory assistance: learning user formulations for business listings
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034634
C. Popovici, M. Andorno, P. Laface, L. Fissore, M. Nigra, C. Vair
One of the main problems in automatic directory assistance (DA) for business listings is that customers formulate their requests for the same listing with great variability. We show that an automatic approach can detect, from field data, user formulations that the designers did not foresee, and that these can be added as variants to the denominations already included in the system to reduce its failures.
Citations: 1
Evaluating long-term spectral subtraction for reverberant ASR
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034598
David Gelbart, Nelson Morgan
Even a modest degree of room reverberation can greatly increase the difficulty of automatic speech recognition. We have observed large increases in speech recognition word error rates when using a far-field (3-6 feet) microphone in a conference room, in comparison with recordings from head-mounted microphones. In this paper, we describe experiments with a proposed remedy based on the subtraction of an estimate of the log spectrum from a long-term (e.g., 2 s) analysis window, followed by overlap-add resynthesis. Since the technique is essentially one of enhancement, the processed signal it generates can be used as input for complete speech recognition systems. Here we report results with both the HTK and the SRI Hub-5 recognizer. For simpler recognizer configurations and/or moderate-sized training, the improvements are huge, while moderate improvements are still observed for more complex configurations under a number of conditions.
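The central step, subtracting a long-term estimate of the log spectrum, can be illustrated with a minimal sketch. It assumes the log-magnitude spectra of the analysis windows have already been computed and it omits the overlap-add resynthesis; the function name and the toy data are hypothetical and are not taken from the paper.

```python
def subtract_long_term_log_mean(log_frames):
    """Subtract the per-bin mean log magnitude (over all frames) from every frame."""
    n_frames = len(log_frames)
    n_bins = len(log_frames[0])
    mean = [sum(frame[k] for frame in log_frames) / n_frames for k in range(n_bins)]
    return [[frame[k] - mean[k] for k in range(n_bins)] for frame in log_frames]


# Toy data (hypothetical): three frames, two frequency bins of "clean" log magnitudes.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]

# A fixed channel scales each bin's magnitude by a constant,
# which is a constant per-bin offset in the log domain.
channel_offset = [0.5, -1.5]
distorted = [[v + h for v, h in zip(frame, channel_offset)] for frame in clean]

normalized_clean = subtract_long_term_log_mean(clean)
normalized_distorted = subtract_long_term_log_mean(distorted)
```

In this toy case the normalized clean and distorted frames coincide, because the constant log-domain offset is absorbed by the mean. Real reverberation only approximates such a fixed offset, and only when the analysis window is long compared with the room response, which is presumably why the abstract stresses a long-term window.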
Citations: 46
n-gram and decision tree based language identification for written words
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034655
J. Hakkinen, Jilei Tian
As the demand for multilingual speech recognizers increases, the development of systems which combine automatic language identification, language-specific pronunciation modeling and language-independent acoustic models becomes increasingly important. When the recognition grammar is dynamic and obtained directly from written text, the language associated with each grammar item has to be identified using that text. Many methods proposed in the literature require fairly large amounts of text, which may not always be available. This paper describes a text-based language identification system developed for the identification of the language of short words, e.g., proper names. Two different approaches are compared. The n-gram method commonly used in the literature is first reviewed and further enhanced. We also propose a simple method for language identification that is based on decision trees. The methods are first evaluated in a text-based language identification task. Both methods are also tested as preprocessors for a multilingual speech recognition task, where the language of each text item has to be determined, in order to choose the correct text-to-pronunciation mapping. The experimental results show that the proposed methods perform very well, and merit further development.
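To make the n-gram approach concrete, here is a minimal sketch of character-trigram language identification for single words. It is a generic baseline with add-one smoothing, not the enhanced method of the paper; the function names and the tiny training lists are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with '_' at both ends."""
    padded = "_" * (n - 1) + word.lower() + "_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(words_by_language, n=3):
    """Count character n-grams separately for each language."""
    return {
        lang: Counter(g for w in words for g in char_ngrams(w, n))
        for lang, words in words_by_language.items()
    }


def identify(word, models, n=3):
    """Pick the language whose smoothed n-gram model gives the word the highest log score."""
    vocab_size = len(set().union(*models.values()))

    def log_score(counts):
        total = sum(counts.values())
        # Add-one smoothing keeps unseen n-grams from zeroing out the score.
        return sum(
            math.log((counts[g] + 1) / (total + vocab_size))
            for g in char_ngrams(word, n)
        )

    return max(models, key=lambda lang: log_score(models[lang]))


# Hypothetical toy training data; a real system would use far larger word lists.
models = train({
    "english": ["the", "this", "that", "with", "which"],
    "finnish": ["kanssa", "joka", "mutta", "kuin", "koska"],
})
```

With these toy models, `identify("then", models)` selects `"english"` and `identify("kukka", models)` selects `"finnish"`. Short words yield only a handful of n-grams, which is the data-sparsity problem the abstract points to.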
Citations: 54
The ALERT system: advanced broadcast speech recognition technology for selective dissemination of multimedia information
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034647
G. Rigoll
This paper presents a brief description of the ALERT system, which is under development by a consortium working on a research project sponsored by the European Commission. The ALERT system uses advanced speech recognition technology and video processing techniques to process large broadcast speech archives and multimedia information resources, with the aim of extracting specific information from such databases and informing selected customers about their contents. It is one of the most ambitious projects currently carried out in the human language technologies (HLT) area (see also http://alert.uni-duisburg.de). The paper describes the objectives of the overall system, its basic system architecture and the scientific approach taken in order to realize the specified demonstrators.
Citations: 8
CORBA-based speech-to-speech translation system
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034660
R. Gruhn, K. Takashima, A. Nishino, S. Nakamura
We describe the new implementation of a speech-to-speech translation system at ATR Spoken Language Translation Research Laboratories (SLT). We use the architecture standard CORBA (Common Object Request Broker Architecture) to interface between a speech recognizer, translation system and TTS engine. Various input types are supported, including close-talking microphone and telephony hardware.
Citations: 1
Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034586
Shang-Ming Lee, Shih-Hau Fang, J. Hung, Lin-Shan Lee
Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, only limited effort has been made to optimize the shape of the filters in the filter-bank of the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coefficients are data-driven and obtained by applying principal component analysis (PCA) to the FFT spectrum of the training data. The experimental results show that this method is robust in noisy environments and that its gains are additive with those of other noise-handling techniques.
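As a rough illustration of deriving filter weights from data with PCA, the sketch below computes the first principal component of training spectra restricted to one frequency band and uses it as that band's filter shape. The abstract does not give the exact procedure, so the per-band setup, the power-iteration solver, the function names and the toy data are all assumptions.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the unit-norm first principal component of the rows (power iteration)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the spectral bins in this band.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # deterministic starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data; keep the current vector
        v = [x / norm for x in w]
    return v


def band_output(spectrum_band, weights):
    """Apply the data-driven filter: a weighted sum of the band's spectral bins."""
    return sum(w * s for w, s in zip(weights, spectrum_band))


# Hypothetical training spectra for one three-bin band; all variation lies along (1, 2, 1).
training_band = [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [3.0, 6.0, 3.0], [0.0, 0.0, 0.0]]
weights = first_principal_component(training_band)
```

For this toy band the learned weights are proportional to (1, 2, 1), a shape recovered from the data rather than fixed in advance as in a conventional triangular filter.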
Citations: 38
Language models beyond word strings
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034614
E. Noth, A. Batliner, H. Niemann, G. Stemmer, F. Gallwitz, J. Spilker
In this paper we want to show how n-gram language models can be used to provide additional information in automatic speech understanding systems beyond the pure word chain. This becomes important in the context of conversational dialogue systems that have to recognize and interpret spontaneous speech. We show how n-grams can: (1) help to classify prosodic events like boundaries and accents; (2) be extended to directly provide boundary information in the speech recognition phase; (3) help to process speech repairs; and (4) detect and semantically classify out-of-vocabulary words. The approaches can work on the best word chain or a word hypotheses graph. Examples and experimental results are provided from our own research within the EVAR information retrieval system and the VERBMOBIL speech-to-speech translation system.
Citations: 1
Dialogue management in the Talk'n'Travel system
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034631
D. Stallard
A central problem for mixed-initiative dialogue management is coping with user utterances that fall outside of the expected sequence of dialogue. Independent initiative by the user may require a complete revision of the future course of the dialogue, even when the system is engaged in activities of its own, such as querying a database. This paper presents an event-driven, goal-based dialogue manager component we have developed to cope with these challenges. The dialogue manager is explicitly designed for asynchronous input and flexible control, and uses a tree-ordered rule language we have developed that also provides for close coupling with discourse processing. The dialogue manager is implemented as part of Talk'n'Travel, a simulated air travel reservation dialogue system we have developed under the US DARPA Communicator dialogue research program, whose purpose and scope we also briefly summarize.
Citations: 8
The statistical approach to spoken language translation
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034663
H. Ney
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition, we show how the required probability distributions can be structured into three parts: the language model, the alignment model and the lexicon model. We describe the components of the system and report results on the VERBMOBIL task. The experience obtained in the VERBMOBIL project, in particular a large-scale end-to-end evaluation, showed that the statistical approach resulted in significantly lower error rates than three competing translation approaches: the sentence error rate was 29% in comparison with 52% to 62% for the other translation approaches. Finally, we discuss the integrated approach to speech translation as opposed to the serial approach that is widely used nowadays.
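The three-part decomposition can be written out as follows. This is the standard source-channel formulation with a first-order alignment model, given here as one common instantiation rather than as the exact model of the paper; the notation (source sentence f_1^J, target sentence e_1^I, alignment a_1^J) is an assumption.

```latex
\begin{align}
\hat{e}_1^I &= \arg\max_{e_1^I} \; \underbrace{\Pr(e_1^I)}_{\text{language model}} \cdot \Pr(f_1^J \mid e_1^I) \\
\Pr(f_1^J \mid e_1^I) &= \sum_{a_1^J} \prod_{j=1}^{J}
  \underbrace{p(a_j \mid a_{j-1}, I, J)}_{\text{alignment model}} \cdot
  \underbrace{p(f_j \mid e_{a_j})}_{\text{lexicon model}}
\end{align}
```

The first line is the Bayes decision rule, and the second introduces the alignment as a hidden variable so that the translation probability factors into alignment and lexicon terms.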
Citations: 5
Computing consensus translation from multiple machine translation systems
Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034659
B. Bangalore, Germán Bordel, G. Riccardi
We address the problem of computing a consensus translation given the outputs from a set of machine translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the subjective and objective performance of the multilingual acquisition approach on a limited domain spoken language application. We evaluate five domain-independent off-the-shelf MT systems and show that the consensus-based translation performance is equal to or better than any of the given MT systems, in terms of both objective and subjective measures.
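The consensus step can be sketched under a strong simplifying assumption: the hypotheses are taken to be already aligned column by column (with a gap symbol where a system has no word), and the consensus is a plain majority vote per column. The paper's own alignment algorithm and consensus computation are not reproduced here; the gap symbol, function name and example sentences are hypothetical.

```python
from collections import Counter

GAP = "<eps>"  # hypothetical gap symbol marking "no word" in an aligned column


def consensus_from_aligned(aligned_hypotheses):
    """Majority-vote each aligned column and drop columns where the gap wins."""
    consensus = []
    for column in zip(*aligned_hypotheses):
        token, _count = Counter(column).most_common(1)[0]
        if token != GAP:
            consensus.append(token)
    return consensus


# Hypothetical outputs of three MT systems, already aligned to equal length.
aligned = [
    ["i", "want", "to", "book", "a", "flight"],
    ["i", "wish", "to", "book", GAP, "flight"],
    ["i", "want", GAP, "reserve", "a", "flight"],
]
result = consensus_from_aligned(aligned)
```

Here `result` is the token list for "i want to book a flight": each system's individual error is outvoted by the other two, which is the intuition behind a consensus that matches or beats every single system.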
Citations: 177