Learning human-object interactions by attention aggregation

Dongzhou Gu, Shuang Cai, Shiwei Ma
DOI: 10.1117/12.2604708 (https://doi.org/10.1117/12.2604708)
Published: 2021-08-06
Source: International Workshop on Pattern Recognition in NeuroImaging, volume 68 1, pages 119130H - 119130H-5
Publication type: Journal Article
Citations: 0

Abstract

In recent years, deep neural networks have achieved impressive progress in object detection. However, detecting the interactions between objects remains challenging. Human-object interaction (HOI) detection has attracted wide attention as a basic task in detailed scene understanding. Most conventional HOI detectors work in a two-stage manner and are usually slow at inference. One-stage methods, which detect HOI triples directly and in parallel, break through the limitations of object detection, but the features they extract are still insufficient. To overcome these drawbacks, we propose an improved one-stage HOI detection approach in which an attention aggregation module and a dynamic point matching strategy play key roles. The attention aggregation module explicitly enhances the semantic expressiveness of interaction points by aggregating contextually important information, while the matching strategy effectively filters negative HOI pairs at the inference stage. Extensive experiments on two challenging HOI detection benchmarks, VCOCO and HICO-DET, show that our method achieves performance competitive with the state of the art without any additional human pose or language features.
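
The abstract does not spell out how the dynamic point matching strategy works, so the following is only a minimal sketch of the general idea behind point-based matching in one-stage HOI detectors, not the authors' implementation. It assumes that each interaction point predicts offsets to a human center and an object center, and that a pair is kept only when both predicted centers fall close enough to a detected box. The names Box, InteractionPoint, match_pairs and the size-scaled tolerance ratio are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned detection box with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)

    @property
    def size(self):
        # Shorter side of the box; used to scale the matching tolerance.
        return min(self.x2 - self.x1, self.y2 - self.y1)


@dataclass
class InteractionPoint:
    """Interaction point with predicted offsets (dx, dy) to both centers."""
    x: float
    y: float
    to_human: tuple
    to_object: tuple
    score: float


def _nearest(target, boxes, ratio):
    """Return (index, distance) of the closest box center to target,
    or None if no center lies within ratio * box size (assumed rule)."""
    best = None
    for i, box in enumerate(boxes):
        cx, cy = box.center
        dist = math.hypot(cx - target[0], cy - target[1])
        if dist <= ratio * box.size and (best is None or dist < best[1]):
            best = (i, dist)
    return best


def match_pairs(points, humans, objects, ratio=0.5):
    """Pair each interaction point with one human and one object box.

    Points whose predicted centers match no detection are dropped,
    which is how negative HOI pairs get filtered in this sketch.
    """
    triples = []
    for p in points:
        h = _nearest((p.x + p.to_human[0], p.y + p.to_human[1]), humans, ratio)
        o = _nearest((p.x + p.to_object[0], p.y + p.to_object[1]), objects, ratio)
        if h is None or o is None:
            continue  # treated as a negative pair
        score = p.score * humans[h[0]].score * objects[o[0]].score
        triples.append((h[0], o[0], score))
    return triples
```

Scaling the tolerance by box size means large boxes accept larger center errors than small ones; this is one plausible reading of "dynamic" and is an assumption, since the source gives no formula.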