
2008 IEEE Spoken Language Technology Workshop: Latest Papers

Modelling multimodal user ID in dialogue
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777853
H. Holzapfel, A. Waibel
This paper presents an approach to modelling user ID in dialogue. A belief network is used to integrate ID classifiers, such as face ID and voice ID, with person-related information, such as a person's first and last name obtained from speech recognition or spelling. Different network structures are analyzed, compared with each other, and compared with a rule-based user model. The approach is evaluated on dialogue data collected in a person identification scenario, which includes both identification of known persons and interactive learning of the names and IDs of unknown persons.
Citations: 4
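As a rough illustration of the evidence fusion described in the abstract of "Modelling multimodal user ID in dialogue", the sketch below combines a prior over identities with face-ID and voice-ID likelihoods. It assumes the simplest possible belief network, in which the two classifiers are conditionally independent given the true identity (naive Bayes); this structure, the function name `fuse_id_evidence`, and all numbers are illustrative assumptions and are not taken from the paper.

```python
def fuse_id_evidence(prior, face_likelihood, voice_likelihood):
    """Return a posterior over identities.

    Assumption (not from the paper): face and voice evidence are
    conditionally independent given the true identity.
    """
    unnormalized = {
        person: prior[person] * face_likelihood[person] * voice_likelihood[person]
        for person in prior
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence rules out every candidate identity")
    return {person: score / total for person, score in unnormalized.items()}


# Hypothetical values: P(identity) and P(observation | identity).
prior = {"alice": 0.4, "bob": 0.4, "unknown": 0.2}
face = {"alice": 0.7, "bob": 0.2, "unknown": 0.1}
voice = {"alice": 0.5, "bob": 0.3, "unknown": 0.2}

posterior = fuse_id_evidence(prior, face, voice)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

Further evidence, such as a recognized first or last name, could be folded in the same way as an additional likelihood factor, provided the independence assumption is acceptable.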
Identifying salient utterances of online spoken documents using descriptive hypertext
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777868
Xiao-Dan Zhu, Siavash Kazemian, Gerald Penn
The Internet has become an important supply channel of spoken documents. Efficient ways of navigating their content are highly desirable. This paper aims to identify the most salient utterances from online spoken documents using relevant hypertext that encapsulates key information. Experimental results show that hypertext features are helpful when properly utilized and if the bit rates used to compress the spoken documents are reasonable.
Citations: 0
Joint n-best rescoring for repeated utterances in spoken dialog systems
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777858
D. Bohus, G. Zweig, Patrick Nguyen, Xiao Li
Due to speech recognition errors, repetitions are a frequent phenomenon in spoken dialog systems. In previous work (G. Zweig et al., 2008) we have proposed a joint decoding model that can leverage structural relationships between repeated utterances for improving recognition performance. In this paper we extend this work in two directions. First, we propose a direct, classification-based model for the same task. The new model can leverage features that were fundamentally hard to capture in the previous framework (e.g. spellings, false-starts, etc.) and leads to an additional performance improvement. Second, we show how both models can be used to perform a combined rescoring of two n-best lists that are part of a repetition pair.
Citations: 2
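The idea of rescoring two n-best lists jointly, as described in the abstract of "Joint n-best rescoring for repeated utterances in spoken dialog systems", can be sketched with a toy scoring rule. The sketch assumes each hypothesis carries a log score and that a repetition pair is rewarded with a fixed bonus when both hypotheses agree; the bonus, the function name `joint_rescore`, and the example data are hypothetical and stand in for the paper's actual models, which the abstract does not specify.

```python
from itertools import product


def joint_rescore(nbest_first, nbest_second, agreement_bonus=1.0):
    """Pick the best hypothesis pair from two n-best lists.

    Each list holds (hypothesis, log_score) tuples. The pair score is the
    sum of both log scores plus a bonus when the hypotheses are identical
    (an illustrative stand-in for a learned joint model).
    """
    def pair_score(pair):
        (hyp1, score1), (hyp2, score2) = pair
        bonus = agreement_bonus if hyp1 == hyp2 else 0.0
        return score1 + score2 + bonus

    (hyp1, _), (hyp2, _) = max(product(nbest_first, nbest_second), key=pair_score)
    return hyp1, hyp2


# Hypothetical n-best lists for an utterance and its repetition.
first = [("austin", -1.0), ("boston", -1.2)]
second = [("boston", -0.9), ("austin", -1.5)]

print(joint_rescore(first, second))                       # agreement rewarded
print(joint_rescore(first, second, agreement_bonus=0.0))  # independent 1-best
```

With the bonus switched off, each list simply contributes its own top hypothesis, which shows what the joint term adds.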
PDTSL: An annotated resource for speech reconstruction
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777848
Jan Hajic, Silvie Cinková, Marie Mikulová, P. Pajas, J. Ptáček, J. Toman, Zdenka Uresová
We describe a new resource (Prague Dependency Treebank of Spoken Language) being created for English and Czech, to be used for speech understanding, broad natural language analysis for dialog systems, and other speech-related tasks, including speech editing. The resources we have created so far contain audio and a standard transcription of spontaneous speech, but as a novel layer, we add an edited ("reconstructed") version of the spoken utterances. These edits go beyond the scope of current speech reconstruction efforts in that, on top of the usual deletions of speech artifacts, fillers, etc., we also allow word modifications, insertions, and word order changes. We have used both monologue and dialogue recordings in English and Czech to verify the feasibility of such transcription. We have also assessed the quality of the resulting annotation, since the relative freedom of the editing raises the issue of what a "correct" annotation is.
Citations: 13
Automatic FrameNet-based annotation of conversational speech
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777843
Bonaventura Coppola, Alessandro Moschitti, Sara Tonelli, G. Riccardi
Current Spoken Language Understanding technology is based on a simple concept annotation of word sequences, where the interdependencies between concepts and their compositional semantics are neglected. This prevents effective handling of language phenomena and consequently limits the design of more complex dialog systems. In this paper, we argue that shallow semantic representation as formulated in the Berkeley FrameNet Project may be useful for improving the capability to manage more complex dialogs. To prove this, the first step is to show that a FrameNet parser of sufficient accuracy can be designed for conversational speech. We show that, by exploiting a small set of FrameNet-based manual annotations, it is possible to design an effective semantic parser. Our experiments on an Italian spoken dialog corpus, created within the LUNA project, show that our approach is able to automatically annotate unseen dialog turns with high accuracy.
Citations: 11
Simultaneous machine translation of German lectures into English: Investigating research challenges for the future
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777883
Matthias Wölfel, M. Kolss, Florian Kraft, J. Niehues, M. Paulik, A. Waibel
An increasingly globalized world fosters the exchange of students, researchers, and employees. As a result, situations in which people of different native tongues listen to the same lecture are becoming more and more frequent. In many such situations, human interpreters are prohibitively expensive or simply not available. For this reason, and because first prototypes have already demonstrated the feasibility of such systems, automatic translation of lectures is receiving increasing attention. A large vocabulary and strong variations in speaking style make lecture translation a challenging, though not hopeless, task. This paper investigates a variety of challenges and highlights possible solutions in building a system for simultaneous translation of lectures from German to English. While some of the investigated challenges are more general, e.g. environment robustness, others are more specific to this particular task, e.g. pronunciation of foreign words or sentence segmentation. We also report our progress in building an end-to-end system and analyze its performance in terms of objective and subjective measures.
Citations: 9
Speaker turn characterization for spoken dialog system monitoring and adaptation
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777860
Géraldine Damnati, Frédéric Béchet, R. Mori
This paper describes an utterance classification method based on a multiple decoding scheme. We use the Spoken Language Understanding (SLU) strategy proposed within the European project LUNA. The goal of this classification process is to characterize each speaker's turn, in a dialog context, according to different categories relevant from an SLU point of view: out-of-domain messages, requests not covered by the interpretation module, frequent requests, etc. These categories are used for two purposes in an off-line mode: system monitoring, for detecting changes in users' behaviour, and system adaptation, by selecting dialogs likely to contain some phenomenon poorly covered by the models for an active learning scheme. All models are trained and evaluated on the France Telecom FT3000 corpus.
Citations: 1
Class-based named entity translation in a speech to speech translation system
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777888
S. Maskey, Martin Cmejrek, Bowen Zhou, Yuqing Gao
Named entity (NE) translation is a challenging problem in machine translation (MT). Most of the training bi-text corpora for MT lack enough samples of NEs to cover the wide variety of contexts NEs can appear in. In this paper, we present a technique to translate NEs based on their NE types in addition to a phrase-based translation model. Our NE translation model is based on a syntax-based system similar to the work of Chiang (2005), but we produce syntax-based rules with NE types as non-terminals instead of general non-terminals. Such class-based rules allow us to generalize better over the contexts of NEs. We show that our proposed method obtains an absolute improvement of 0.66 BLEU, as well as 0.26% in F1-measure, over the phrase-based baseline on the NE test set.
Citations: 2
Using prior knowledge to assess relevance in speech summarization
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777867
Ricardo Ribeiro, David Martins de Matos
We explore the use of topic-based automatically acquired prior knowledge in speech summarization, assessing its influence throughout several term weighting schemes. All information is combined using latent semantic analysis as a core procedure to compute the relevance of the sentence-like units of the given input source. Evaluation is performed using the self-information measure, which tries to capture the informativeness of the summary in relation to the summarized input source. The similarity of the output summaries of the several approaches is also analyzed.
Citations: 3
Methods for improving the quality of syllable based speech synthesis
Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777832
Y. R. Venugopalakrishna, M. V. Vinodh, H. Murthy, C. S. Ramalingam
Our earlier work [1] on speech synthesis has shown that syllables can produce reasonably natural quality speech. Nevertheless, audible artifacts are present due to discontinuities in pitch, energy, and formant trajectories at the joining point of the units. In this paper, we present some minimal signal modification techniques for reducing these artifacts.
Citations: 16