Jacob Arkin, Siddharth Patki, J. Rosser, T. Howard
Published in: 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2022-08-29. DOI: 10.1109/RO-MAN53752.2022.9900835
An Efficient Algorithm for Visualization and Interpretation of Grounded Language Models
Contemporary approaches to grounded language communication accept an utterance and the current world representation as input and produce symbols representing the meaning as output. Since modern approaches to language understanding for human-robot interaction use techniques rooted in machine learning, the quality or sensitivity of the solution is often opaque relative to small changes in input. Although it is possible to sample and visualize solutions over a large space of inputs, naïve application of current techniques is often prohibitively expensive for real-time feedback. In this paper, we address this problem by reformulating the inference process of Distributed Correspondence Graphs to recompute only the subsets of spatially dependent constituent features over a space of sampled environment models. We quantitatively evaluate the speed of inference in physical experiments involving a tabletop robot manipulation scenario. We demonstrate the ability to visualize, in real time, configurations of the environment where symbol grounding produces consistent solutions, and we illustrate how these techniques can be used to identify and repair gaps or inaccuracies in training data.
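The idea of recomputing only spatially dependent features across sampled environment models can be illustrated with a small sketch. This is not the paper's implementation: it assumes a generic weighted-feature score, and every name in it (World, STATIC, SPATIAL, scores_naive, scores_incremental, the two example features, and the weights) is hypothetical. It only shows that the contribution of features that ignore object positions can be computed once and reused for every sampled world.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical world model: object name -> (x, y) position on the table.
World = Dict[str, Tuple[float, float]]
Feature = Callable[[str, str, World], float]
Weighted = List[Tuple[float, Feature]]  # (weight, feature) pairs


def word_matches_symbol(phrase: str, symbol: str, world: World) -> float:
    # Spatially independent: depends only on the language and the symbol.
    return 1.0 if "left" in phrase and symbol == "left_of" else 0.0


def cup_left_of_box(phrase: str, symbol: str, world: World) -> float:
    # Spatially dependent: changes when object positions change.
    return 1.0 if world["cup"][0] < world["box"][0] else 0.0


# Illustrative weights only; a trained model would supply these.
STATIC: Weighted = [(1.5, word_matches_symbol)]
SPATIAL: Weighted = [(2.0, cup_left_of_box)]


def weighted_sum(features: Weighted, phrase: str, symbol: str, world: World) -> float:
    return sum(w * f(phrase, symbol, world) for w, f in features)


def scores_naive(static: Weighted, spatial: Weighted, phrase: str,
                 symbol: str, worlds: List[World]) -> List[float]:
    # Re-evaluates every feature for every sampled world.
    return [weighted_sum(static + spatial, phrase, symbol, w) for w in worlds]


def scores_incremental(static: Weighted, spatial: Weighted, phrase: str,
                       symbol: str, worlds: List[World]) -> List[float]:
    # Evaluates the spatially independent part once, then only the
    # spatially dependent features for each sampled world.
    if not worlds:
        return []
    base = weighted_sum(static, phrase, symbol, worlds[0])
    return [base + weighted_sum(spatial, phrase, symbol, w) for w in worlds]


if __name__ == "__main__":
    # Sample the cup's x position while the box stays fixed.
    worlds = [{"cup": (x, 0.0), "box": (0.5, 0.0)} for x in (0.0, 0.25, 0.75, 1.0)]
    phrase, symbol = "the cup left of the box", "left_of"
    print(scores_naive(STATIC, SPATIAL, phrase, symbol, worlds))
    print(scores_incremental(STATIC, SPATIAL, phrase, symbol, worlds))
```

Both functions return the same scores; the incremental version simply avoids re-evaluating features whose value cannot change between sampled worlds, which is where the saving would come from when the static feature set is large.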