Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta
{"title":"位于车辆中的多模态对话系统","authors":"Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta","doi":"10.1145/2535948.2535951","DOIUrl":null,"url":null,"abstract":"In this paper, we address Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates multi-modal inputs of speech, geo-location, gaze (face direction) and dialog history to answer drivers' queries about their surroundings. To select appropriate data source used to answer queries, we apply belief tracking across the above modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head irection) and geo-location estimations. We report the result and analysis on the data.","PeriodicalId":403097,"journal":{"name":"GazeIn '13","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Situated multi-modal dialog system in vehicles\",\"authors\":\"Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta\",\"doi\":\"10.1145/2535948.2535951\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we address Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates multi-modal inputs of speech, geo-location, gaze (face direction) and dialog history to answer drivers' queries about their surroundings. To select appropriate data source used to answer queries, we apply belief tracking across the above modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head irection) and geo-location estimations. We report the result and analysis on the data.\",\"PeriodicalId\":403097,\"journal\":{\"name\":\"GazeIn '13\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"GazeIn '13\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2535948.2535951\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"GazeIn '13","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2535948.2535951","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
In this paper, we address Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates multi-modal inputs of speech, geo-location, gaze (face direction) and dialog history to answer drivers' queries about their surroundings. To select appropriate data source used to answer queries, we apply belief tracking across the above modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head irection) and geo-location estimations. We report the result and analysis on the data.