{"title":"Alexa, let's work together! How Alexa Helps Customers Complete Tasks with Verbal and Visual Guidance in the Alexa Prize TaskBot Challenge","authors":"Y. Maarek","doi":"10.1145/3503161.3549912","DOIUrl":null,"url":null,"abstract":"In this talk, I will present the Alexa Prize TaskBot Challenge, which allows selected academic teams to develop TaskBots. TaskBots are agents that interact with Alexa users who require assistance (via \"Alexa, let's work together\") to complete everyday tasks requiring multiple steps and decisions, such as cooking and home improvement. One of the unique elements of this challenge is its multi-modal nature, where users receive both verbal guidance and visual instructions, when a screen is available (e.g., on Echo Show devices). Some of the hard AI challenges the teams addressed included leveraging domain knowledge, tacking dialogue state, supporting adaptive and robust conversations and probably the most relevant to this conference: handling multi-modal interactions.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3549912","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
In this talk, I will present the Alexa Prize TaskBot Challenge, which allows selected academic teams to develop TaskBots. TaskBots are agents that interact with Alexa users who require assistance (via "Alexa, let's work together") to complete everyday tasks requiring multiple steps and decisions, such as cooking and home improvement. One of the unique elements of this challenge is its multi-modal nature, where users receive both verbal guidance and visual instructions when a screen is available (e.g., on Echo Show devices). Some of the hard AI challenges the teams addressed included leveraging domain knowledge, tracking dialogue state, supporting adaptive and robust conversations, and, probably the most relevant to this conference, handling multi-modal interactions.
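To make the multi-modal, multi-step guidance concrete, the sketch below shows a minimal dialogue-state tracker for a stepwise task that produces verbal guidance always and a visual instruction only when a screen is available. This is an illustrative toy in plain Python, not the TaskBot teams' actual architecture or the Alexa API; all names (TaskStep, DialogueState, respond, advance) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskStep:
    spoken: str                   # verbal guidance read aloud
    visual: Optional[str] = None  # text shown on a screen, if one is available


@dataclass
class DialogueState:
    """Minimal dialogue state: which task the user is doing and which step they are on."""
    task_name: str
    steps: List[TaskStep]
    current: int = 0

    def respond(self, has_screen: bool) -> dict:
        """Return the response for the current step: speech always, display only with a screen."""
        step = self.steps[self.current]
        response = {"speech": step.spoken}
        if has_screen and step.visual:
            response["display"] = step.visual  # e.g., rendered as a card on a screen device
        return response

    def advance(self) -> bool:
        """Move to the next step; return False when the task is finished."""
        if self.current + 1 < len(self.steps):
            self.current += 1
            return True
        return False


# Example: a two-step cooking task delivered with verbal and (optional) visual guidance.
state = DialogueState(
    task_name="scrambled eggs",
    steps=[
        TaskStep("First, crack two eggs into a bowl and whisk them.",
                 "Step 1: Crack 2 eggs, whisk until uniform."),
        TaskStep("Now pour the eggs into a buttered pan over medium heat.",
                 "Step 2: Pour into buttered pan, medium heat."),
    ],
)

print(state.respond(has_screen=True))   # speech plus display card
state.advance()
print(state.respond(has_screen=False))  # speech only, no screen available
```

Even this toy shows why the challenge is hard: a real TaskBot must keep such state consistent across interruptions, clarification questions, and modality switches, rather than simply stepping forward through a fixed list.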