MLP-AIR: An effective MLP-based module for actor interaction relation learning in group activity recognition

Knowledge-Based Systems | IF 7.6 | Tier 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2024-11-25 | Epub Date: 2024-09-02 | DOI: 10.1016/j.knosys.2024.112453

Guoliang Xu, Jianqin Yin, Shaojie Zhang, Moonjun Gong

Knowledge-Based Systems, volume 304, Article 112453. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0950705124010876

Citations: 0

Abstract

Modeling actor interaction relations is crucial for group activity recognition. Previous approaches often adopt a fixed paradigm that computes an affinity matrix to model these relations, and this paradigm achieves strong performance. On the one hand, the affinity matrix introduces an inductive bias: actor interaction relations should be computed dynamically from the input actor features. On the other hand, MLPs with static parameterization, whose parameters are fixed after training, can represent arbitrary functions. It is therefore an open question whether this inductive bias is necessary for modeling actor interaction relations. To explore its impact, we propose an affinity-matrix-free paradigm that directly uses an MLP with static parameterization to model actor interaction relations. We term this approach MLP-AIR. This paradigm overcomes the limitations of the inductive bias and improves the capture of implicit actor interaction relations. Specifically, MLP-AIR consists of two sub-modules: the MLP-based Interaction relation modeling module (MLP-I) and the MLP-based Relation refining module (MLP-R). MLP-I models spatial-temporal interaction relations by emphasizing cross-actor and cross-frame feature learning, while MLP-R refines the relations between the channels of each relation feature, thereby enhancing the expressive power of the features. MLP-AIR is a plug-and-play module. To evaluate it, we applied MLP-AIR to replicate three representative methods and conducted extensive experiments on two widely used benchmarks, the Volleyball and Collective Activity datasets. The experiments demonstrate that MLP-AIR achieves favorable results. The code is available at https://github.com/Xuguoliang12/MLP-AIR.
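The contrast the abstract draws between the two paradigms can be made concrete with a small sketch. It is not the authors' implementation (their code is in the repository linked above). The abstract does not give layer details, so the Mixer-style layout, the residual connections, the ReLU activation, the tensor sizes, and every function name below are assumptions chosen for illustration. Features are plain nested lists of shape T x N x C (frames, actors, channels).

```python
import math
import random

# Toy sizes (all hypothetical): T frames, N actors, C channels, H hidden units.
T, N, C, H = 3, 4, 8, 16
rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mlp(x, w1, w2):
    """Two-layer MLP with ReLU, applied independently to each row of x."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def residual(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def affinity_relation(x):
    """Affinity-matrix paradigm for one frame: the N x N mixing weights
    are recomputed from the input features x (N x C) on every call."""
    scale = math.sqrt(len(x[0]))
    weights = []
    for row in matmul(x, transpose(x)):
        exps = [math.exp((s - max(row)) / scale) for s in row]  # stable softmax
        total = sum(exps)
        weights.append([e / total for e in exps])
    return matmul(weights, x)


def mix_actors(frames, w1, w2):
    """Static cross-actor mixing: the MLP runs along the actor axis of each
    frame, and its weights do not depend on the input."""
    return [residual(x, transpose(mlp(transpose(x), w1, w2))) for x in frames]


def mix_frames(frames, w1, w2):
    """Static cross-frame mixing: the MLP runs along the frame axis per actor."""
    per_actor = [[frame[n] for frame in frames] for n in range(len(frames[0]))]
    mixed = [residual(x, transpose(mlp(transpose(x), w1, w2))) for x in per_actor]
    return [[actor[t] for actor in mixed] for t in range(len(frames))]


def refine_channels(frames, w1, w2):
    """Channel refinement: the MLP runs along the channel axis of each feature."""
    return [residual(x, mlp(x, w1, w2)) for x in frames]


frames = [rand_matrix(N, C) for _ in range(T)]
actor_w = (rand_matrix(N, H), rand_matrix(H, N))
frame_w = (rand_matrix(T, H), rand_matrix(H, T))
channel_w = (rand_matrix(C, H), rand_matrix(H, C))

relation = mix_frames(mix_actors(frames, *actor_w), *frame_w)  # MLP-I-like step
refined = refine_channels(relation, *channel_w)                # MLP-R-like step
baseline = [affinity_relation(x) for x in frames]              # affinity paradigm
print(len(refined), len(refined[0]), len(refined[0][0]))
```

In `affinity_relation` the mixing weights are a function of the input, which is the inductive bias the abstract questions. In `mix_actors` and `mix_frames` the mixing is carried entirely by fixed weight matrices, so the same parameters are applied to every input. Both paths keep the T x N x C shape, which is consistent with the plug-and-play claim.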

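Source journal: Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months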

Journal description: Knowledge-Based Systems is an international and interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.