Graph-based High-order Relation Modeling for Long-term Action Recognition

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Pub Date: 2021-06-01
DOI: 10.1109/CVPR46437.2021.00887

Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Weishi Zheng


Citations: 27

Abstract

Long-term actions involve many important visual concepts, e.g., objects, motions, and sub-actions, and there are various relations among these concepts, which we call basic relations. These basic relations jointly affect each other during the temporal evolution of long-term actions, forming the high-order relations that are essential for long-term action recognition. In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit these high-order relations for long-term action recognition. In GHRM, each basic relation is modeled by a graph in which each node represents a segment of a long video. Moreover, when modeling each basic relation, GHRM incorporates the information from all the other basic relations, so that the high-order relations in long-term actions can be well exploited. To better exploit the high-order relations along the time dimension, we design a GHRM-layer consisting of a Temporal-GHRM branch and a Semantic-GHRM branch, which model the local temporal high-order relations and the global semantic high-order relations, respectively. The experimental results on three long-term action recognition datasets, namely Breakfast, Charades, and MultiThumos, demonstrate the effectiveness of our model.
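
The idea of one graph per basic relation, with segments as nodes and information shared across relations, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how edges are computed or how relations are fused, so the dot-product similarity, the averaging rule in `mix_graphs`, the `alpha` weight, and the `window` parameter (used here to mimic a local temporal graph versus a global one) are all assumptions chosen for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax; -inf scores become exactly 0.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def relation_graph(feats, window=None):
    """Row-normalised adjacency over video segments (assumed: dot-product similarity).
    window=None connects all segments; window=w connects segments at most w steps apart."""
    n = len(feats)
    adj = []
    for i in range(n):
        scores = []
        for j in range(n):
            if window is not None and abs(i - j) > window:
                scores.append(float("-inf"))  # mask distant segments
            else:
                scores.append(sum(a * b for a, b in zip(feats[i], feats[j])))
        adj.append(softmax(scores))
    return adj

def mix_graphs(graphs, k, alpha=0.5):
    """Blend graph k with the mean of the other relation graphs.
    Hypothetical fusion rule; requires at least two relations."""
    others = [g for idx, g in enumerate(graphs) if idx != k]
    n = len(graphs[k])
    return [
        [alpha * graphs[k][i][j]
         + (1 - alpha) * sum(g[i][j] for g in others) / len(others)
         for j in range(n)]
        for i in range(n)
    ]

def propagate(adj, feats):
    # One message-passing step: each segment aggregates its neighbours' features.
    dim = len(feats[0])
    return [
        [sum(adj[i][j] * feats[j][d] for j in range(len(feats))) for d in range(dim)]
        for i in range(len(adj))
    ]

def high_order_layer(relation_feats, window=None):
    # relation_feats[k][t] is the feature vector of segment t under relation k.
    graphs = [relation_graph(f, window) for f in relation_feats]
    return [propagate(mix_graphs(graphs, k), f) for k, f in enumerate(relation_feats)]

# Toy input: 2 relations, 4 segments, 2-dimensional features (made-up values).
relation_feats = [
    [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]],
    [[0.5, 0.5], [0.4, 0.6], [0.9, 0.1], [1.0, 0.0]],
]
local_out = high_order_layer(relation_feats, window=1)  # local temporal neighbourhood
global_out = high_order_layer(relation_feats)           # all segments connected
```

Because every mixed adjacency row is a convex combination of softmax rows, each row still sums to 1, so the propagation step is a weighted average of segment features. A learned model would replace the fixed similarity and averaging with trainable projections.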


