Situated multi-modal dialog system in vehicles

GazeIn '13 | Pub Date: 2013-12-13 | DOI: 10.1145/2535948.2535951

Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta


Citations: 22

Abstract


Situated multi-modal dialog system in vehicles

In this paper, we present Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates multi-modal inputs of speech, geo-location, gaze (face direction), and dialog history to answer drivers' queries about their surroundings. To select the appropriate data source for answering a query, we apply belief tracking across these modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head direction) and geo-location estimation. We report the results and an analysis of the data.

