In this paper, we propose a robust natural language interface called CASIS for controlling devices in an intelligent environment. CASIS is novel in that it integrates physical context acquired from sensors embedded in the environment with traditionally used context to reduce the system error rate and to disambiguate deictic references and elliptical inputs. The n-best result of the speech recognizer is re-ranked by a score calculated using a Bayesian network that combines information from the input utterance and the context. In our prototype system, which uses device states, brightness, speaker location, chair occupancy, speech direction, and action history as context, the system error rate was reduced by 41% compared to a baseline system that does not leverage context information.
{"title":"CASIS: a context-aware speech interface system","authors":"H. Lee, Shinsuke Kobayashi, N. Koshizuka, K. Sakamura","doi":"10.1145/1040830.1040880","DOIUrl":"https://doi.org/10.1145/1040830.1040880","url":null,"abstract":"In this paper, we propose a robust natural language interface called CASIS for controlling devices in an intelligent environment. CASIS is novel in a sense that it integrates physical context acquired from the sensors embedded in the environment with traditionally used context to reduce the system error rate and disambiguate deictic references and elliptical inputs. The n-best result of the speech recognizer is re-ranked by a score calculated using a Bayesian network consisting of information from the input utterance and context. In our prototype system that uses device states, brightness, speaker location, chair occupancy, speech direction and action history as context, the system error rate has been reduced by 41% compared to a baseline system that does not leverage on context information.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"340 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124779167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finding the optimal teaching strategy for an individual student is difficult even for an experienced teacher. Identifying and incorporating multiple optimal teaching strategies for different students in a class is even harder. This paper presents an Adaptive tutor for online Learning, AtoL, for Computer Science laboratories that identifies and applies appropriate teaching strategies for students on an individual basis. The optimal strategy for a student is identified in two steps. First, a basic strategy is identified using rules learned by a supervised learning system. Then the basic strategy is refined to better fit the student using models learned by an unsupervised learning system that takes into account the temporal nature of the problem-solving process. The learning algorithms as well as initial experimental results are presented.
{"title":"Adaptive teaching strategy for online learning","authors":"Jungsoon P. Yoo, Cen Li, C. Pettey","doi":"10.1145/1040830.1040892","DOIUrl":"https://doi.org/10.1145/1040830.1040892","url":null,"abstract":"Finding the optimal teaching strategy for an individual student is difficult even for an experienced teacher. Identifying and incorporating multiple optimal teaching strategies for different students in a class is even harder. This paper presents an Adaptive tutor for online Learning, AtoL, for Computer Science laboratories that identifies and applies the appropriate teaching strategies for students on an individual basis. The optimal strategy for a student is identified in two steps. First, a basic strategy for a student is identified using rules learned from a supervised learning system. Then the basic strategy is refined to better fit the student using models learned using an unsupervised learning system that takes into account the temporal nature of the problem solving process. The learning algorithms as well as the initial experimental results are presented.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124996224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SemanticTalk is a tool for supporting face-to-face meetings and discussions by automatically generating a semantic context from spoken conversations. We use speech recognition and topic extraction from a large terminological database to create a network of discussion topics in real time. This network includes concepts explicitly addressed in the discussion as well as semantically associated terms, and it is visualized to increase conversational awareness and creativity in the group.
{"title":"Generating semantic contexts from spoken conversation in meetings","authors":"J. Ziegler, Zoulfa El Jerroudi, Karsten Böhm","doi":"10.1145/1040830.1040902","DOIUrl":"https://doi.org/10.1145/1040830.1040902","url":null,"abstract":"SemanticTalk is a tool for supporting face-to-face meetings and discussions by automatically generating a semantic context from spoken conversations. We use speech recognition and topic extraction from a large terminological database to create a network of discussion topics in real-time. This network includes concepts explicitly addressed in the discussion as well as semantically associated terms, and is visualized to increase conversational awareness and creativity in the group.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"69 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125874408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we explore the conventions that people use in managing multiple dialogue threads. In particular, we focus on where in a thread people interrupt when switching to another thread. We find that some subjects are able to vary where they switch depending on how urgent the interrupting task is. When time allowed, they switched at the end of a discourse segment, which we hypothesize is less disruptive to the interrupted task when it is later resumed.
{"title":"Conventions in human-human multi-threaded dialogues: a preliminary study","authors":"P. Heeman, Fan Yang, A. Kun, A. Shyrokov","doi":"10.1145/1040830.1040903","DOIUrl":"https://doi.org/10.1145/1040830.1040903","url":null,"abstract":"In this paper, we explore the conventions that people use in managing multiple dialogue threads. In particular, we focus on where in a thread people interrupt when switching to another thread. We find that some subjects are able to vary where they switch depending on how urgent the interrupting task is. When time-allowed, they switched at the end of a discourse segment, which we hypothesize is less disruptive to the interrupted task when it is later resumed.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129154612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The aim of this research was to develop methods for the automatic, person-independent estimation of experienced emotions from facial expressions. Ten subjects watched series of emotionally arousing pictures and videos while the electromyographic (EMG) activity of two facial muscles was registered: zygomaticus major (activated in smiling) and corrugator supercilii (activated in frowning). Based on the changes in the activity of these two facial muscles, it was possible to distinguish between ratings of positive and negative emotional experiences at a rate of almost 70% for pictures and over 80% for videos. Using these methods, the computer could adapt its behavior to the user's emotions during human-computer interaction.
{"title":"Person-independent estimation of emotional experiences from facial expressions","authors":"Timo Partala, Veikko Surakka, T. Vanhala","doi":"10.1145/1040830.1040883","DOIUrl":"https://doi.org/10.1145/1040830.1040883","url":null,"abstract":"The aim of this research was to develop methods for the automatic person-independent estimation of experienced emotions from facial expressions. Ten subjects watched series of emotionally arousing pictures and videos, while the electromyographic (EMG) activity of two facial muscles: zygomaticus major (activated in smiling) and corrugator supercilii (activated in frowning) was registered. Based on the changes in the activity of these two facial muscles, it was possible to distinguish between ratings of positive and negative emotional experiences at a rate of almost 70% for pictures and over 80% for videos. Using these methods, the computer could adapt its behavior according to the user's emotions during human-computer interaction.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114815731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we introduce the User Interface Pilot, a model-based software tool that enables designers and engineers to create the initial specifications for the pages of a website, or for the screens of a desktop or mobile application. The tool guides the design of these specifications, commonly known as wireframes, in a user-centered fashion by framing the context of the design within the concepts of user tasks, user types, and data objects. Unlike previous model-based tools, the User Interface Pilot does not impose a rigid model-driven methodology and functions well within common software engineering development processes. The tool has been used in over twenty real-world user interface design projects.
{"title":"The UI pilot: a model-based tool to guide early interface design","authors":"A. Puerta, M. Micheletti, Alan Mak","doi":"10.1145/1040830.1040877","DOIUrl":"https://doi.org/10.1145/1040830.1040877","url":null,"abstract":"In this paper, we introduce the User Interface Pilot, a model-based software tool that enables designers and engineers to create the initial specifications for the pages of a website, or for the screens of a desktop or mobile application. The tool guides the design of these specifications, commonly known as wireframes, in a user-centered fashion by framing the context of the design within the concepts of user tasks, user types, and data objects. Unlike previous model-based tools, the User Interface Pilot does not impose a rigid model-driven methodology and functions well within common software engineering development processes. The tool has been used in over twenty real-world user interface design projects.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"270 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127322081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrating and mining data from different web sources can make end-users well informed when they make decisions. One of the many limitations that bar end-users from taking advantage of such a process is the complexity of each of the steps required to gather, integrate, monitor, and mine data from different websites. We present the idea of combining data integration, monitoring, and mining into a single process in the form of an intelligent assistant that guides end-users to specify their mining tasks simply by answering questions. This easy-to-use approach, which trades off the complexity of the available operations against ease of use, can provide interesting insight into data that would require days of human effort to gather, combine, and mine manually from the web.
{"title":"Interactively building agents for consumer-side data mining","authors":"R. Tuchinda, Craig A. Knoblock","doi":"10.1145/1040830.1040891","DOIUrl":"https://doi.org/10.1145/1040830.1040891","url":null,"abstract":"Integrating and mining data from different web sources can make end-users well-informed when they make decisions. One of many limitations that bars end-users from taking advantages of such process is the complexity in each of the steps required to gather, integrate, monitor, and mine data from different websites. We present the idea of combining the data integration, monitoring, and mining as one single process in the form of an intelligent assistant that guides end-users to specify their mining tasks by just answering questions. This easy-to-use approach, which trades off complexity in terms of available operations with the ease of use, has the ability to provide interesting insight into the data that would requires days of human effort to gather, combine, and mine manually from the web.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124803351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper presents a first prototype of a handheld museum guide that delivers contextualized information based on the recognition of drawing details selected by the user through the guide's camera. The resulting interaction modality has been analyzed and compared to previous approaches. Finally, alternative, more scalable solutions are presented that preserve the most interesting features of the described system.
{"title":"Communicating user's focus of attention by image processing as input for a mobile museum guide","authors":"A. Albertini, R. Brunelli, O. Stock, M. Zancanaro","doi":"10.1145/1040830.1040905","DOIUrl":"https://doi.org/10.1145/1040830.1040905","url":null,"abstract":"The paper presents a first prototype of a handheld museum guide delivering contextualized information based on the recognition of drawing details selected by the user through the guide camera. The resulting interaction modality has been analyzed and compared to previous approaches. Finally, alternative, more scalable, solutions are presented that preserve the most interesting features of the system described.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127288030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new generation of intelligent applications can be enabled by broad-coverage knowledge repositories about everyday objects. We distill lessons in the design of intelligent user interfaces that collect such broad-coverage knowledge from untrained volunteers. We motivate the knowledge-driven, template-based approach adopted in Learner2, a second-generation proactive acquisition interface for eliciting such knowledge. We present the volume, accuracy, and recall of knowledge collected by fielding the system for 5 months. Learner2 has so far acquired 99,018 general statements, emphasizing knowledge about parts of and typical uses of objects.
{"title":"Designing interfaces for guided collection of knowledge about everyday objects from volunteers","authors":"Timothy Chklovski","doi":"10.1145/1040830.1040910","DOIUrl":"https://doi.org/10.1145/1040830.1040910","url":null,"abstract":"A new generation of intelligent applications can be enabled by broad-coverage knowledge repositories about everyday objects. We distill lessons in design of intelligent user interfaces which collect such broad-coverage knowledge from untrained volunteers. We motivate the knowledge-driven template-based approach adopted in Learner2, a second generation proactive acquisition interface for eliciting such knowledge. We present volume, accuracy, and recall of knowledge collected by fielding the system for 5 months. Learner2 has so far acquired 99,018 general statements, emphasizing knowledge about parts of and typical uses of objects.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"155 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131072783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe several approaches for using prosodic features of speech and audio localization to control interactive applications. This information can be applied to parameter control, as well as to speech disambiguation. We discuss how characteristics of spoken sentences can be exploited in the user interface; for example, by considering the speed with which a sentence is spoken and the presence of extraneous utterances. We also show how coarse audio localization can be used for low-fidelity gesture tracking, by inferring the speaker's head position.
{"title":"Interaction techniques using prosodic features of speech and audio localization","authors":"A. Olwal, Steven K. Feiner","doi":"10.1145/1040830.1040900","DOIUrl":"https://doi.org/10.1145/1040830.1040900","url":null,"abstract":"We describe several approaches for using prosodic features of speech and audio localization to control interactive applications. This information can be applied to parameter control, as well as to speech disambiguation. We discuss how characteristics of spoken sentences can be exploited in the user interface; for example, by considering the speed with which a sentence is spoken and the presence of extraneous utterances. We also show how coarse audio localization can be used for low-fidelity gesture tracking, by inferring the speaker's head position.","PeriodicalId":376409,"journal":{"name":"Proceedings of the 10th international conference on Intelligent user interfaces","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121124557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}