{"title":"通过深度学习支持的增强视频界面进行多视角人机交互","authors":"Grimaldo Silva, K. Rekik, A. Kanso, L. Schnitman","doi":"10.1109/RO-MAN53752.2022.9900671","DOIUrl":null,"url":null,"abstract":"As the world surpasses a billion cameras [1] and their coverage of the public and private spaces increases, the possibility of using their visual feed to not just observe, but to command robots through their video becomes an ever more interesting prospect. Our work deals with multi-perspective interaction, where a robot autonomously maps image pixels from reachable cameras to positions on its global coordinate space. This enables an operator to send the robot to specific positions in a camera with no manual calibration. Furthermore, robot information, such as planned paths, can be used to augment all affected camera images with an overlayed projection of their visual information. The robustness of this approach has been validated in both simulated and real world experiments.","PeriodicalId":250997,"journal":{"name":"2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Multi-perspective human robot interaction through an augmented video interface supported by deep learning\",\"authors\":\"Grimaldo Silva, K. Rekik, A. Kanso, L. Schnitman\",\"doi\":\"10.1109/RO-MAN53752.2022.9900671\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As the world surpasses a billion cameras [1] and their coverage of the public and private spaces increases, the possibility of using their visual feed to not just observe, but to command robots through their video becomes an ever more interesting prospect. 
Our work deals with multi-perspective interaction, where a robot autonomously maps image pixels from reachable cameras to positions on its global coordinate space. This enables an operator to send the robot to specific positions in a camera with no manual calibration. Furthermore, robot information, such as planned paths, can be used to augment all affected camera images with an overlayed projection of their visual information. The robustness of this approach has been validated in both simulated and real world experiments.\",\"PeriodicalId\":250997,\"journal\":{\"name\":\"2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)\",\"volume\":\"38 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RO-MAN53752.2022.9900671\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RO-MAN53752.2022.9900671","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-perspective human robot interaction through an augmented video interface supported by deep learning
As the world surpasses a billion cameras [1] and their coverage of public and private spaces increases, the possibility of using their visual feeds not only to observe, but also to command robots becomes an ever more interesting prospect. Our work deals with multi-perspective interaction, in which a robot autonomously maps image pixels from reachable cameras to positions in its global coordinate frame. This enables an operator to send the robot to a specific position seen in a camera view without any manual calibration. Furthermore, robot information, such as planned paths, can be used to augment all affected camera images with an overlaid projection of that visual information. The robustness of this approach has been validated in both simulated and real-world experiments.
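The core mapping the abstract describes, from a pixel in a fixed camera to a position in the robot's global frame, is commonly modeled with a planar homography when the camera observes a flat floor. The sketch below illustrates that idea only; the matrix values, function name, and the homography formulation itself are illustrative assumptions, not taken from the paper (whose calibration-free approach is learned rather than hand-specified).

```python
import numpy as np

# Hypothetical 3x3 floor-plane homography for one fixed camera.
# The values are made up for illustration; in practice H would be
# estimated (e.g. from pixel/world correspondences) or, as in the
# paper's setting, inferred without manual calibration.
H = np.array([
    [0.01,  0.0,    -3.2],
    [0.0,   0.012,  -2.4],
    [0.0,   0.0001,  1.0],
])

def pixel_to_world(u: float, v: float, H: np.ndarray) -> tuple[float, float]:
    """Project an image pixel (u, v) onto the floor plane.

    Applies the homography in homogeneous coordinates and
    normalizes by the third component to get metric (x, y).
    """
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Clicking a pixel in the camera view yields a goal in the
# robot's global frame, which can then be sent as a nav target.
x, y = pixel_to_world(320, 240, H)
```

The inverse mapping (world to pixel, via `np.linalg.inv(H)`) is what would let the robot overlay its planned path back onto each affected camera image, as the abstract describes.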