{"title":"Natural language spoken interface control using data-driven semantic inference","authors":"J. Bellegarda, Kim E. A. Silverman","doi":"10.1109/TSA.2003.811534","DOIUrl":null,"url":null,"abstract":"Spoken interaction tasks are typically approached using a formal grammar as language model. While ensuring good system performance, this imposes a rigid framework on users, by implicitly forcing them to conform to a pre-defined interaction structure. This paper introduces the concept of data-driven semantic inference, which in principle allows for any word constructs in command/query formulation. Each unconstrained word string is automatically mapped onto the intended action through a semantic classification against the set of supported actions. As a result, it is no longer necessary for users to memorize the exact syntax of every command. The underlying (latent semantic analysis) framework relies on co-occurrences between words and commands, as observed in a training corpus. A suitable extension can also handle commands that are ambiguous at the word level. The behavior of semantic inference is characterized using a desktop user interface control task involving 113 different actions. Under realistic usage conditions, this approach exhibits a 2 to 5% classification error rate. Various training scenarios of increasing scope are considered to assess the influence of coverage on performance. Sufficient semantic knowledge about the task domain is found to be captured at a level of coverage as low as 70%. This illustrates the good generalization properties of semantic inference.","PeriodicalId":13155,"journal":{"name":"IEEE Trans. Speech Audio Process.","volume":"28 1","pages":"267-277"},"PeriodicalIF":0.0000,"publicationDate":"2003-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Trans. Speech Audio Process.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TSA.2003.811534","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17
Abstract
Spoken interaction tasks are typically approached using a formal grammar as language model. While ensuring good system performance, this imposes a rigid framework on users, by implicitly forcing them to conform to a pre-defined interaction structure. This paper introduces the concept of data-driven semantic inference, which in principle allows for any word constructs in command/query formulation. Each unconstrained word string is automatically mapped onto the intended action through a semantic classification against the set of supported actions. As a result, it is no longer necessary for users to memorize the exact syntax of every command. The underlying (latent semantic analysis) framework relies on co-occurrences between words and commands, as observed in a training corpus. A suitable extension can also handle commands that are ambiguous at the word level. The behavior of semantic inference is characterized using a desktop user interface control task involving 113 different actions. Under realistic usage conditions, this approach exhibits a 2 to 5% classification error rate. Various training scenarios of increasing scope are considered to assess the influence of coverage on performance. Sufficient semantic knowledge about the task domain is found to be captured at a level of coverage as low as 70%. This illustrates the good generalization properties of semantic inference.