An architecture for spoken dialogue management

Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI:10.21437/ICSLP.1996-270

D. Duff, B. Gates, S. Luperfoy

Pages: 1025-1028

Citations: 15

Abstract

We propose an architecture for integrating discourse processing and speech recognition (SR) in spoken dialogue systems. It was first developed for computer-mediated bilingual dialogue in voice-to-voice machine translation applications, and we apply it here to a distributed battlefield simulation system used for military training. According to this architecture, discourse functions previously distributed through the interface code are collected into a centralized discourse capability. The Dialogue Manager (DM) acts as a third-party mediator overseeing the translation of input and output utterances between English and the command language of the backend system. The DM calls the Discourse Processor (DP) to update the context representation each time an utterance is issued or when a salient non-linguistic event occurs in the simulation. The DM is responsible for managing the interaction among components of the interface system and the user. For task-based human-computer dialogue systems it consults three sources of non-linguistic context constraint in addition to the linguistic Discourse State: (1) a User Model, (2) a static Domain Model containing rules for engaging the backend system, with a grammar for the language of well-formed, executable commands, and (3) a dynamic Backend Model (BEM) that maintains updated status for salient aspects of the non-linguistic context. In this paper we describe the four-step recovery algorithm that the DM invokes whenever an item is unclear in the current context or when an interpretation error occurs, and we show how parameter settings on the algorithm can modify the overall behavior of the system from Tutor to Trainer. This is offered to illustrate how limited (inexpensive) dialogue processing functionality, judiciously selected and designed in conjunction with expectations for human dialogue behavior, can compensate for inevitable limitations in the SR, the NL processor, the backend software application, or even in the user’s understanding of the task or the software system.

1. SPOKEN DIALOGUE SYSTEMS

1.1 Integrating Discourse and SR

Waibel et al. (1989) and De Mori et al. (1988) extend stochastic language modeling techniques to the discourse level to improve spoken dialogue systems. The complexity of discourse state descriptions leads to a sparse data problem during training, and idiosyncratic human behavior at run time can defeat even the best probabilistic dialogue model. Symbolic approaches to spoken discourse data identify discourse constraints on language model selection at run time. Our work collects discourse-level processing into a centralized discourse capability as part of a modular user interface dialogue architecture. Its use in a spoken dialogue interface to a distributed battlefield simulation system used for military training is diagrammed in Figure 1.
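
The component relationships described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every class, method, and field name here is hypothetical, the Domain Model's command grammar is reduced to a plain set of well-formed command strings, and the four-step recovery algorithm is left out because its steps are not given in the text above. The sketch shows only that the DM asks the DP to update the context on each utterance and on each salient backend event, and that it holds the three non-linguistic context sources.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseProcessor:
    """Hypothetical DP: keeps the linguistic Discourse State as an ordered log."""
    state: list = field(default_factory=list)

    def update(self, kind: str, content: str) -> None:
        # Record an utterance or a salient non-linguistic event in the context.
        self.state.append((kind, content))


@dataclass
class DialogueManager:
    """Hypothetical DM holding the DP and the three non-linguistic sources."""
    dp: DiscourseProcessor
    user_model: dict     # (1) User Model (assumed to be a simple dict)
    domain_model: set    # (2) static Domain Model: a set standing in for a grammar
    backend_model: dict  # (3) dynamic Backend Model (BEM)

    def on_utterance(self, speaker: str, text: str) -> None:
        # The DM calls the DP each time an utterance is issued.
        self.dp.update("utterance", f"{speaker}: {text}")

    def on_backend_event(self, key: str, value: str) -> None:
        # A salient simulation event updates the BEM and the discourse context.
        self.backend_model[key] = value
        self.dp.update("event", f"{key}={value}")

    def is_executable(self, command: str) -> bool:
        # Stand-in for checking a command against the Domain Model grammar.
        return command in self.domain_model


# Toy usage with made-up commands and state keys.
dm = DialogueManager(
    dp=DiscourseProcessor(),
    user_model={"role": "trainee"},
    domain_model={"MOVE unit1 north"},
    backend_model={},
)
dm.on_utterance("user", "move unit one north")
dm.on_backend_event("unit1.position", "grid-7")
```

Keeping the DP as the only writer of the context log mirrors the paper's stated goal of collecting discourse functions in one place rather than spreading them through the interface code.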

