In this paper we describe a method for developing a virtual instructor for pedestrian navigation based on real interactions between a human instructor and a human pedestrian. A virtual instructor is an agent capable of fulfilling the role of a human instructor, and its goal is to assist a pedestrian in accomplishing different tasks within the context of a real city. The instructor decides what to say using a generation-by-selection algorithm over a corpus of real interactions collected in the world of interest. The instructor can react to different requests from the pedestrian. It is also aware of the pedestrian's position, with a certain degree of uncertainty, and it can use different city landmarks to guide them.
{"title":"A Natural Language Instructor for pedestrian navigation based in generation by selection","authors":"Santiago Avalos, Luciana Benotti","doi":"10.3115/v1/W14-0205","DOIUrl":"https://doi.org/10.3115/v1/W14-0205","url":null,"abstract":"In this paper we describe a method for developing a virtual instructor for pedestrian navigation based on real interactions between a human instructor and a human pedestrian. A virtual instructor is an agent capable of fulfilling the role of a human instructor, and its goal is to assist a pedestrian in the accomplishment of different tasks within the context of a real city. The instructor decides what to say using a generation by selection algorithm, based on a corpus of real interactions generated within the world of interest. The instructor is able to react to different requests by the pedestrian. It is also aware of the pedestrian position with a certain degree of uncertainty, and it can use different city landmarks to guide him.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121892195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Labský, L. Kunc, T. Macek, Jan Kleindienst, J. Vystrcil
In this paper we describe a set of techniques we found suitable for building multi-modal search applications for automotive environments. Because these applications often search across different topical domains, such as maps, weather, or Wikipedia, we discuss the problem of switching focus between domains. We also propose techniques for minimizing the response time of the search system in mobile environments. We evaluate some of the proposed techniques in usability tests with 10 novice test subjects who drove a simulated lane-change test on a driving simulator, and we report results on the induced driving distraction and user acceptance.
{"title":"Recipes for building voice search UIs for automotive","authors":"M. Labský, L. Kunc, T. Macek, Jan Kleindienst, J. Vystrcil","doi":"10.3115/v1/W14-0204","DOIUrl":"https://doi.org/10.3115/v1/W14-0204","url":null,"abstract":"In this paper we describe a set of techniques we found suitable for building multi-modal search applications for automotive environments. As these applications often search across different topical domains, such as maps, weather or Wikipedia, we discuss the problem of switching focus between different domains. Also, we propose techniques useful for minimizing the response time of the search system in mobile environment. We evaluate some of the proposed techniques by means of usability tests with 10 novice test subjects who drove a simulated lane change test on a driving simulator. We report results describing the induced driving distraction and user acceptance.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130548495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aasish Pappu, Ming Sun, Seshadri Sridharan, Alexander I. Rudnicky
Open environments present an attention management challenge for conversational systems. We describe a kiosk system (based on RavenClaw-Olympus) that uses simple auditory and visual information to interpret human presence and manage the system's attention. The system robustly differentiates intended interactions from unintended ones at an accuracy of 93% and provides similar task completion rates in both a quiet room and a public space.
{"title":"Conversational Strategies for Robustly Managing Dialog in Public Spaces","authors":"Aasish Pappu, Ming Sun, Seshadri Sridharan, Alexander I. Rudnicky","doi":"10.3115/v1/W14-0211","DOIUrl":"https://doi.org/10.3115/v1/W14-0211","url":null,"abstract":"Open environments present an attention management challenge for conversational systems. We describe a kiosk system (based on Ravenclaw‐Olympus) that uses simple auditory and visual information to interpret human presence and manage the system’s attention. The system robustly differentiates intended interactions from unintended ones at an accuracy of 93% and provides similar task completion rates in both a quiet room and a public space.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114072072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paper presented at the Workshop on Dialogue in Motion, 14th Conference of the European Chapter of the Association for Computational Linguistics. Gothenburg, Sweden, 26 April 2014.
{"title":"Mining human interactions to construct a virtual guide for a virtual fair","authors":"A. Luna, Luciana Benotti","doi":"10.3115/v1/W14-0206","DOIUrl":"https://doi.org/10.3115/v1/W14-0206","url":null,"abstract":"Ponencia presentada en la 14th Conference of the European Chapter of the Association for Computational Linguistics. Workshop on Dialogue in Motion. Gotemburgo, Suecia, 26 de abril de 2014","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130483794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clare R. Voss, Taylor Cassidy, Douglas Summers-Stay
This paper briefly sketches new work in progress: (i) developing task-based scenarios in which human-robot teams collaboratively explore real-world environments where the robot is immersed but the humans are not; (ii) extracting and constructing "multi-modal interval corpora" from the dialog, video, and LIDAR messages recorded in ROS bagfiles during task sessions; and (iii) testing automated methods to identify, track, and align co-referent content both within and across modalities in these interval corpora. The pre-pilot study and its corpora provide a unique, empirical starting point for our longer-term research objective: characterizing the balance of explicitly shared and tacitly assumed information exchanged during effective teamwork.
{"title":"Collaborative Exploration in Human-Robot Teams: What’s in their Corpora of Dialog, Video, & LIDAR Messages?","authors":"Clare R. Voss, Taylor Cassidy, Douglas Summers-Stay","doi":"10.3115/v1/W14-0207","DOIUrl":"https://doi.org/10.3115/v1/W14-0207","url":null,"abstract":"This paper briefly sketches new work-inprogress (i) developing task-based scenarios where human-robot teams collaboratively explore real-world environments in which the robot is immersed but the humans are not, (ii) extracting and constructing “multi-modal interval corpora” from dialog, video, and LIDAR messages that were recorded in ROS bagfiles during task sessions, and (iii) testing automated methods to identify, track, and align co-referent content both within and across modalities in these interval corpora. The pre-pilot study and its corpora provide a unique, empirical starting point for our longerterm research objective: characterizing the balance of explicitly shared and tacitly assumed information exchanged during effective teamwork. 1 Overview","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126991760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a first, largely qualitative analysis of a set of human-human dialogues recorded specifically to provide insights into how humans handle pauses and resumptions when the speakers cannot see each other and must rely on the acoustic signal alone. The work is part of a larger effort to find unobtrusive human dialogue behaviours that can be mimicked and implemented in in-car spoken dialogue systems within the EU project Get Home Safe, a collaboration between KTH, DFKI, Nuance, IBM, and Daimler that aims to find ways of driver interaction that minimize safety issues. The analysis reveals several human temporal, semantic/pragmatic, and structural behaviours that are good candidates for inclusion in spoken dialogue systems.
{"title":"Human pause and resume behaviours for unobtrusive humanlike in-car spoken dialogue systems","authors":"Jens Edlund, Fredrik Edelstam, Joakim Gustafson","doi":"10.3115/v1/W14-0213","DOIUrl":"https://doi.org/10.3115/v1/W14-0213","url":null,"abstract":"This paper presents a first, largely qualitative analysis of a set of human-human dialogues recorded specifically to provide insights in how humans handle pauses and resumptions in situations where the speakers cannot see each other, but have to rely on the acoustic signal alone. The work presented is part of a larger effort to find unobtrusive human dialogue behaviours that can be mimicked and implemented in-car spoken dialogue systems within in the EU project Get Home Safe, a collaboration between KTH, DFKI, Nuance, IBM and Daimler aiming to find ways of driver interaction that minimizes safety issues,. The analysis reveals several human temporal, semantic/pragmatic, and structural behaviours that are good candidates for inclusion in spoken dialogue systems.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123042418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Janarthanam, Robin L. Hill, A. Dickinson, Morgan Fredriksson
We present an analysis of a Pedestrian Navigation and Information dialogue corpus collected using a Wizard-of-Oz interface. We analysed how wizards preferred to communicate with users given three options: preset buttons that generate an utterance; sequences of buttons and drop-down lists that construct complex utterances; and free-text utterances. Based on these findings, we offer suggestions for future WoZ interface design.
{"title":"Click or Type: An Analysis of Wizard’s Interaction for Future Wizard Interface Design","authors":"S. Janarthanam, Robin L. Hill, A. Dickinson, Morgan Fredriksson","doi":"10.3115/v1/W14-0203","DOIUrl":"https://doi.org/10.3115/v1/W14-0203","url":null,"abstract":"We present an analysis of a Pedestrian Navigation and Information dialogue corpus collected using a Wizard-of-Oz interface. We analysed how wizards preferred to communicate to users given three different options: preset buttons that can generate an utterance, sequences of buttons and dropdown lists to construct complex utterances and free text utterances. We present our findings and suggestions for future WoZ design based on our findings.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122230821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Vystrcil, T. Macek, David Luksch, M. Labský, L. Kunc, Jan Kleindienst, Tereza Kasparová
In this paper we introduce a new UI paradigm that mimics radio broadcast, along with a prototype called Radio One. The approach aims to present useful information from multiple domains to mobile users (e.g. drivers on the go or cell phone users). The information is served in an entertaining manner and in a mostly passive style – without the user having to ask for it – as in a real radio broadcast. The content is generated on the fly by a machine and integrates a mix of personal (calendar, emails) and publicly available but customized information (news, weather, POIs). Most of the spoken audio output is machine-synthesized. The implemented prototype permits passive listening as well as interaction via voice commands or buttons. Initial feedback gathered from testing the prototype while driving indicates good acceptance of the system and relatively low distraction levels.
{"title":"Mostly Passive Information Delivery – a Prototype","authors":"J. Vystrcil, T. Macek, David Luksch, M. Labský, L. Kunc, Jan Kleindienst, Tereza Kasparová","doi":"10.3115/v1/W14-0209","DOIUrl":"https://doi.org/10.3115/v1/W14-0209","url":null,"abstract":"In this paper we introduce a new UI paradigm that mimics radio broadcast along with a prototype called Radio One. The approach aims to present useful information from multiple domains to mobile users (e.g. drivers on the go or cell phone users). The information is served in an entertaining manner in a mostly passive style – without the user having to ask for it– as in real radio broadcast. The content is generated on the fly by a machine and integrates a mix of personal (calendar, emails) and publicly available but customized information (news, weather, POIs). Most of the spoken audio output is machine synthesized. The implemented prototype permits passive listening as well as interaction using voice commands or buttons. Initial feedback gathered while testing the prototype while driving indicates good acceptance of the system and relatively low distraction levels.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114340554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a multi-threaded Interaction Manager (IM) that tracks the different dimensions of user-system conversations, which must interleave with each other in a coherent and timely manner. We explain it in the context of a spoken dialogue system for pedestrian navigation and city question answering, with information push about nearby or visible points of interest (PoIs).
{"title":"Multi-threaded Interaction Management for Dynamic Spatial Applications","authors":"S. Janarthanam, Oliver Lemon","doi":"10.3115/v1/W14-0208","DOIUrl":"https://doi.org/10.3115/v1/W14-0208","url":null,"abstract":"We present a multi-threaded Interaction Manager (IM) that is used to track different dimensions of user-system conversations that are required to interleave with each other in a coherent and timely manner. This is explained in the context of a spoken dialogue system for pedestrian navigation and city question-answering, with information push about nearby or visible points-of-interest (PoI).","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125076307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rudolf Kadlec, Jindřich Libovický, Jan Macek, Jan Kleindienst
Accurate dialog state tracking is crucial for the design of an efficient spoken dialog system. Until recently, quantitative comparison of different state tracking methods was difficult. However, the 2013 Dialog State Tracking Challenge (DSTC) introduced a common dataset and metrics that allow the performance of trackers to be evaluated on a standardized task. In this paper we present our belief tracker, based on the Hidden Information State (HIS) model with an adjusted user-model component. Further, we report the results of our tracker on the test3 dataset from the DSTC. Our tracker is competitive with the trackers submitted to the DSTC: even without training, it achieves the best results on the L2 metrics and places between second and third in accuracy. After adjusting the tracker using the provided data, it outperformed the other submissions in accuracy as well, while improving further in L2. We additionally present preliminary results on the other two DSTC datasets, test1 and test2. Strong performance on the L2 metric means that our tracker produces well-calibrated hypothesis probabilities.
{"title":"IBM’s Belief Tracker: Results On Dialog State Tracking Challenge Datasets","authors":"Rudolf Kadlec, Jindřich Libovický, Jan Macek, Jan Kleindienst","doi":"10.3115/v1/W14-0202","DOIUrl":"https://doi.org/10.3115/v1/W14-0202","url":null,"abstract":"Accurate dialog state tracking is crucial for the design of an efficient spoken dialog system. Until recently, quantitative comparison of different state tracking methods was difficult. However the 2013 Dialog State Tracking Challenge (DSTC) introduced a common dataset and metrics that allow to evaluate the performance of trackers on a standardized task. In this paper we present our belief tracker based on the Hidden Information State (HIS) model with an adjusted user model component. Further, we report the results of our tracker on test3 dataset from DSTC. Our tracker is competitive with trackers submitted to DSTC, even without training it achieves the best results in L2 metrics and it performs between second and third place in accuracy. After adjusting the tracker using the provided data it outperformed the other submissions also in accuracy and yet improved in L2. Additionally we present preliminary results on another two datasets, test1 and test2, used in the DSTC. Strong performance in L2 metric means that our tracker produces well calibrated hypotheses probabilities.","PeriodicalId":198983,"journal":{"name":"DM@EACL","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123316772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}