{"title":"无监督的脱离情境的行动理解","authors":"Hirokatsu Kataoka, Y. Satoh","doi":"10.1109/ICRA.2019.8793709","DOIUrl":null,"url":null,"abstract":"The paper presents an unsupervised out-of-context action (O2CA) paradigm that is based on facilitating understanding by separately presenting both human action and context within a video sequence. As a means of generating an unsupervised label, we comprehensively evaluate responses from action-based (ActionNet) and context-based (ContextNet) convolutional neural networks (CNNs). Additionally, we have created three synthetic databases based on the human action (UCF101, HMDB51) and motion capture (mocap) (SURREAL) datasets. We then conducted experimental comparisons between our approach and conventional approaches. We also compared our unsupervised learning method with supervised learning using an O2CA ground truth given by synthetic data. From the results obtained, we achieved a 96.8 score on Synth-UCF, a 96.8 score on Synth-HMDB, and 89.0 on SURREAL-O2CA with F-score.","PeriodicalId":6730,"journal":{"name":"2019 International Conference on Robotics and Automation (ICRA)","volume":"11 1","pages":"8227-8233"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Unsupervised Out-of-context Action Understanding\",\"authors\":\"Hirokatsu Kataoka, Y. Satoh\",\"doi\":\"10.1109/ICRA.2019.8793709\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper presents an unsupervised out-of-context action (O2CA) paradigm that is based on facilitating understanding by separately presenting both human action and context within a video sequence. As a means of generating an unsupervised label, we comprehensively evaluate responses from action-based (ActionNet) and context-based (ContextNet) convolutional neural networks (CNNs). Additionally, we have created three synthetic databases based on the human action (UCF101, HMDB51) and motion capture (mocap) (SURREAL) datasets. We then conducted experimental comparisons between our approach and conventional approaches. We also compared our unsupervised learning method with supervised learning using an O2CA ground truth given by synthetic data. 
From the results obtained, we achieved a 96.8 score on Synth-UCF, a 96.8 score on Synth-HMDB, and 89.0 on SURREAL-O2CA with F-score.\",\"PeriodicalId\":6730,\"journal\":{\"name\":\"2019 International Conference on Robotics and Automation (ICRA)\",\"volume\":\"11 1\",\"pages\":\"8227-8233\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRA.2019.8793709\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA.2019.8793709","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This paper presents an unsupervised out-of-context action (O2CA) paradigm that facilitates understanding by presenting human action and context separately within a video sequence. To generate an unsupervised label, we comprehensively evaluate the responses of an action-based CNN (ActionNet) and a context-based CNN (ContextNet). In addition, we created three synthetic databases based on human action datasets (UCF101, HMDB51) and a motion-capture (mocap) dataset (SURREAL). We then conducted experimental comparisons between our approach and conventional approaches, and also compared our unsupervised learning method with supervised learning using an O2CA ground truth provided by the synthetic data. Our approach achieved F-scores of 96.8 on Synth-UCF, 96.8 on Synth-HMDB, and 89.0 on SURREAL-O2CA.
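The abstract does not specify how the ActionNet and ContextNet responses are combined into an unsupervised label. The snippet below is only a minimal sketch of one plausible reading: treat a clip as out-of-context when the two streams' class-response distributions disagree strongly. The disagreement measure (a Bhattacharyya-style overlap), the threshold value, and the function names are illustrative assumptions, not the paper's actual procedure.

```python
# Hypothetical sketch (not the authors' exact method): derive an unsupervised
# out-of-context (O2CA) pseudo-label from the responses of an action-based CNN
# (ActionNet) and a context-based CNN (ContextNet).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def o2ca_pseudo_label(action_logits, context_logits, threshold=0.5):
    """Return (label, agreement): label 1 = out-of-context, 0 = in-context."""
    p_action = softmax(np.asarray(action_logits, dtype=float))
    p_context = softmax(np.asarray(context_logits, dtype=float))
    # Agreement measured as the overlap between the two class-response
    # distributions (Bhattacharyya coefficient); low overlap -> out of context.
    agreement = float(np.sum(np.sqrt(p_action * p_context)))
    return int(agreement < threshold), agreement

# Toy usage: the action stream fires on one class, the context stream on another,
# so the clip is flagged as out-of-context.
action_logits  = [6.0, 0.5, 0.2]   # strong response on class 0
context_logits = [0.1, 0.3, 5.5]   # strong response on class 2
label, score = o2ca_pseudo_label(action_logits, context_logits)
print(label, round(score, 3))      # -> 1 0.126
```

The threshold of 0.5 is arbitrary here; in practice it would have to be tuned, for example against the synthetic O2CA ground truth that the paper uses for its supervised comparison.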