Powen Yao, Yu Hou, Yuan He, Da Cheng, Huanpu Hu, Michael Zyda
2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), published 2022-03-01. DOI: 10.1109/VRW55335.2022.00195 (https://doi.org/10.1109/VRW55335.2022.00195)
Toward Using Multi-Modal Machine Learning for User Behavior Prediction in Simulated Smart Home for Extended Reality
In this work, we propose a multi-modal approach for manipulating smart home devices in a smart home environment simulated in virtual reality (VR). We determine the user's target device and desired action from their utterance, their spatial information (gestures, positions, etc.), or a combination of the two. Because the information in the user's utterance and the spatial information can be disjoint or complementary, we process the two sources in parallel using an array of machine learning models. We then use ensemble modeling to aggregate the outputs of these models and improve the quality of the final prediction. We present our preliminary architecture, models, and findings.
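The abstract does not say which ensemble method is used, so the following is only a minimal sketch of one common option, weighted soft voting, applied to two modalities processed independently. The function names (`fuse_predictions`, `predict`), the device labels, the probability values, and the equal default weights are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, Optional

# A prediction is a mapping from candidate device label to probability.
Probs = Dict[str, float]


def fuse_predictions(
    utterance: Optional[Probs],
    spatial: Optional[Probs],
    w_utterance: float = 0.5,  # hypothetical weight, not from the paper
    w_spatial: float = 0.5,    # hypothetical weight, not from the paper
) -> Probs:
    """Weighted soft voting over whichever modalities are available.

    A modality that is None or empty is treated as absent, which covers
    the case where the user only speaks or only gestures.
    """
    candidates = ((utterance, w_utterance), (spatial, w_spatial))
    sources = [(p, w) for p, w in candidates if p]
    if not sources:
        raise ValueError("at least one modality must provide a prediction")

    total_weight = sum(w for _, w in sources)
    if total_weight <= 0:
        raise ValueError("weights of the available modalities must sum to > 0")

    # Union of labels, so a device seen by only one model is still scored.
    labels = set()
    for p, _ in sources:
        labels.update(p)

    # Weighted average of the per-modality probabilities for each label.
    return {
        label: sum(w * p.get(label, 0.0) for p, w in sources) / total_weight
        for label in labels
    }


def predict(fused: Probs) -> str:
    """Return the label with the highest fused score."""
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # "Turn it on" alone is ambiguous; pointing toward the lamp resolves it.
    utterance_probs = {"lamp": 0.5, "tv": 0.5}
    spatial_probs = {"lamp": 0.8, "tv": 0.2}
    print(predict(fuse_predictions(utterance_probs, spatial_probs)))
```

In this sketch the two modalities are complementary: the utterance model is undecided, and the spatial model tips the fused score toward one device. If either input is missing, the remaining modality decides on its own, which mirrors the "utterance, spatial information, or a combination of the two" behavior described above.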