Part2Pose: Inferring Human Pose From Parts in Complex Scenes

IF 3.2 | Zone 2, Engineering & Technology | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Signal Processing Letters | Pub Date: 2024-12-13 | DOI: 10.1109/LSP.2024.3517418
Rong Zhang;Junneng Feng;Cun Feng;Yirui Wang;Lijun Guo
IEEE Signal Processing Letters, vol. 32, pp. 441-445. https://ieeexplore.ieee.org/document/10798470/
Citations: 0

Abstract

Most existing Human Pose Estimation (HPE) methods struggle with the challenges encountered in complex scenes, such as highly variable poses, complex backgrounds, and occlusion. To address these problems, this paper proposes a novel HPE network called Part2Pose. Instead of focusing on small-sized keypoints as existing HPE methods do, Part2Pose first extracts image features based on human body parts, which expands the detection scope. This strategy makes the extracted features more robust to variations and distractions in complex scenes. A Transformer-based Global Part Relation Module (GPRM) and a graph convolutional network-based Local Part Relation Module (LPRM) are then used to capture global and local relationships among different body parts, which helps infer the positions of keypoints. Extensive experiments on challenging datasets, including COCO, CrowdPose, and OCHuman, show that Part2Pose can surpass existing popular state-of-the-art HPE methods. Combining Part2Pose with lightweight networks confirms its robustness and generalizability.
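The abstract names two kinds of part-relation modeling: a Transformer-based global module and a graph-convolution-based local module. The sketch below is a minimal NumPy illustration of those two generic techniques applied to per-part feature vectors. It is not the authors' implementation: the part list, the skeleton edges, the feature size, the random weights, and the additive fusion at the end are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def global_part_relation(parts, w_q, w_k, w_v):
    """Single-head self-attention: every part attends to every other part."""
    q, k, v = parts @ w_q, parts @ w_k, parts @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return attn @ v, attn


def normalized_adjacency(edges, num_parts):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of an undirected graph."""
    a = np.eye(num_parts)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def local_part_relation(parts, a_hat, w):
    """One graph-convolution layer with ReLU: each part mixes with its neighbours."""
    return np.maximum(a_hat @ parts @ w, 0.0)


# Hypothetical body parts and skeleton edges (assumed, not from the paper).
PART_NAMES = ["head", "torso", "left_arm", "right_arm", "left_leg", "right_leg"]
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4), (1, 5)]

rng = np.random.default_rng(0)
num_parts, dim = len(PART_NAMES), 8

# Stand-in for part-level image features; a real model would get these from a backbone.
parts = rng.normal(size=(num_parts, dim))
w_q, w_k, w_v, w_g = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(4))

global_feats, attn = global_part_relation(parts, w_q, w_k, w_v)
a_hat = normalized_adjacency(EDGES, num_parts)
local_feats = local_part_relation(parts, a_hat, w_g)

# Additive fusion is an assumption; the abstract does not say how the two are combined.
fused = global_feats + local_feats
```

The attention matrix `attn` is dense, so an occluded part can borrow evidence from any visible part, while `a_hat` is nonzero only between parts joined by an edge, which restricts the local module to anatomical neighbours.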
Source journal
IEEE Signal Processing Letters (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 7.40
Self-citation rate: 12.80%
Articles published: 339
Review time: 2.8 months
About the journal: The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.
Latest articles from the journal
Diffusion Generalized Minimum Total Error Entropy Algorithm
FDDM: Frequency-Decomposed Diffusion Model for Dose Prediction in Radiotherapy
Heterogeneous Dual-Branch Emotional Consistency Network for Facial Expression Recognition
Conjugate Gradient and Variance Reduction Based Online ADMM for Low-Rank Distributed Networks
Blind Light Field Image Quality Assessment via Frequency Domain Analysis and Auxiliary Learning