"A Natural Language Instructor for pedestrian navigation based in generation by selection" — Santiago Avalos, Luciana Benotti. doi:10.3115/v1/W14-0205
In this paper we describe a method for developing a virtual instructor for pedestrian navigation based on real interactions between a human instructor and a human pedestrian. A virtual instructor is an agent capable of fulfilling the role of a human instructor; its goal is to assist a pedestrian in accomplishing different tasks within the context of a real city. The instructor decides what to say using a generation-by-selection algorithm over a corpus of real interactions collected in the world of interest. It is able to react to different requests from the pedestrian, is aware of the pedestrian's position with a certain degree of uncertainty, and can use different city landmarks to guide them.
"Recipes for building voice search UIs for automotive" — M. Labský, L. Kunc, T. Macek, Jan Kleindienst, J. Vystrcil. doi:10.3115/v1/W14-0204
In this paper we describe a set of techniques we found suitable for building multi-modal search applications for automotive environments. Because these applications often search across different topical domains, such as maps, weather, or Wikipedia, we discuss the problem of switching focus between domains. We also propose techniques for minimizing the response time of the search system in mobile environments. We evaluate some of the proposed techniques by means of usability tests with 10 novice test subjects who drove a simulated lane change test on a driving simulator, and we report results describing the induced driving distraction and user acceptance.
"Conversational Strategies for Robustly Managing Dialog in Public Spaces" — Aasish Pappu, Ming Sun, Seshadri Sridharan, Alexander I. Rudnicky. doi:10.3115/v1/W14-0211
Open environments present an attention management challenge for conversational systems. We describe a kiosk system (based on Ravenclaw-Olympus) that uses simple auditory and visual information to interpret human presence and manage the system's attention. The system robustly differentiates intended interactions from unintended ones with an accuracy of 93% and provides similar task completion rates in a quiet room and in a public space.
"Mining human interactions to construct a virtual guide for a virtual fair" — A. Luna, Luciana Benotti. doi:10.3115/v1/W14-0206
Paper presented at the Workshop on Dialogue in Motion, 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, 26 April 2014.
"Collaborative Exploration in Human-Robot Teams: What's in their Corpora of Dialog, Video, & LIDAR Messages?" — Clare R. Voss, Taylor Cassidy, Douglas Summers-Stay. doi:10.3115/v1/W14-0207
This paper briefly sketches new work in progress on (i) developing task-based scenarios in which human-robot teams collaboratively explore real-world environments where the robot is immersed but the humans are not, (ii) extracting and constructing "multi-modal interval corpora" from dialog, video, and LIDAR messages recorded in ROS bagfiles during task sessions, and (iii) testing automated methods to identify, track, and align co-referent content both within and across modalities in these interval corpora. The pre-pilot study and its corpora provide a unique, empirical starting point for our longer-term research objective: characterizing the balance of explicitly shared and tacitly assumed information exchanged during effective teamwork.
"Click or Type: An Analysis of Wizard's Interaction for Future Wizard Interface Design" — S. Janarthanam, Robin L. Hill, A. Dickinson, Morgan Fredriksson. doi:10.3115/v1/W14-0203
We present an analysis of a Pedestrian Navigation and Information dialogue corpus collected using a Wizard-of-Oz interface. We analysed how wizards preferred to communicate with users given three options: preset buttons that generate an utterance, sequences of buttons and dropdown lists that construct complex utterances, and free-text utterances. We present our findings and, based on them, suggestions for future WoZ interface design.
"Human pause and resume behaviours for unobtrusive humanlike in-car spoken dialogue systems" — Jens Edlund, Fredrik Edelstam, Joakim Gustafson. doi:10.3115/v1/W14-0213
This paper presents a first, largely qualitative analysis of a set of human-human dialogues recorded specifically to provide insights into how humans handle pauses and resumptions when the speakers cannot see each other and must rely on the acoustic signal alone. The work is part of a larger effort to find unobtrusive human dialogue behaviours that can be mimicked and implemented in in-car spoken dialogue systems within the EU project Get Home Safe, a collaboration between KTH, DFKI, Nuance, IBM, and Daimler that aims to find ways of driver interaction that minimize safety issues. The analysis reveals several human temporal, semantic/pragmatic, and structural behaviours that are good candidates for inclusion in spoken dialogue systems.
"Mostly Passive Information Delivery – a Prototype" — J. Vystrcil, T. Macek, David Luksch, M. Labský, L. Kunc, Jan Kleindienst, Tereza Kasparová. doi:10.3115/v1/W14-0209
In this paper we introduce a new UI paradigm that mimics radio broadcast, along with a prototype called Radio One. The approach aims to present useful information from multiple domains to mobile users (e.g. drivers on the go or cell phone users). The information is served in an entertaining manner and in a mostly passive style, without the user having to ask for it, as in a real radio broadcast. The content is generated on the fly by a machine and integrates a mix of personal (calendar, emails) and publicly available but customized information (news, weather, POIs). Most of the spoken audio output is machine-synthesized. The implemented prototype permits passive listening as well as interaction using voice commands or buttons. Initial feedback gathered from testing the prototype while driving indicates good acceptance of the system and relatively low distraction levels.
"Multi-threaded Interaction Management for Dynamic Spatial Applications" — S. Janarthanam, Oliver Lemon. doi:10.3115/v1/W14-0208
We present a multi-threaded Interaction Manager (IM) used to track the different dimensions of user-system conversation that must interleave with each other in a coherent and timely manner. We explain it in the context of a spoken dialogue system for pedestrian navigation and city question-answering, with information push about nearby or visible points of interest (PoIs).
"Situationally Aware In-Car Information Presentation Using Incremental Speech Generation: Safer, and More Effective" — Spyros Kousidis, C. Kennington, Timo Baumann, Hendrik Buschmeier, S. Kopp, David Schlangen. doi:10.3115/v1/W14-0212
Holding non-co-located conversations while driving is dangerous (Horrey and Wickens, 2006; Strayer et al., 2006), much more so than conversations with physically present, "situated" interlocutors (Drews et al., 2004). In-car dialogue systems typically resemble non-co-located conversations more closely, and share their negative impact (Strayer et al., 2013). We implemented and tested a simple strategy for making in-car dialogue systems aware of the driving situation: we gave them the capability to interrupt themselves when a dangerous situation is detected, and to resume when it is over. We show that this improves both driving performance and recall of system-presented information, compared to a non-adaptive strategy.