SD-Pose: facilitating space-decoupled human pose estimation via adaptive pose perception guidance

Impact Factor: 3.1 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS | Multimedia Systems | Pub Date: 2024-05-31 | DOI: 10.1007/s00530-024-01368-y

Zhi Liu, Shengzhao Hao, Yunhua Lu, Lei Liu, Cong Chen, Ruohuang Wang

Citations: 0

Abstract

Human pose estimation is a popular and challenging task in computer vision. Mainstream methods are currently based on Gaussian heatmaps or coordinate regression. However, heatmaps introduce heavy computational overhead and quantization error, which limits their application, while coordinate regression has difficulty learning the mapping for crossed and misaligned keypoints, resulting in poor robustness. Recently, pose estimation based on coordinate classification has encoded global spatial information into one-dimensional representations along the X and Y directions. This turns keypoint localization into a classification problem, simplifying the model while effectively improving pose estimation accuracy. Motivated by this, this work proposes SD-Pose, a spatially decoupled human pose estimation model guided by adaptive pose perception. Specifically, the model first employs a Pyramid Adaptive Feature Extractor (PAFE) to obtain multi-scale feature maps and to generate adaptive keypoint weights, which help the model extract distinctive features for keypoints at different locations. Then, the Spatial Decoupling and Coordinated Analysis Module (SDCAM) simplifies the localization problem while considering both global and fine-grained features. Experimental results on the MPII human pose and COCO keypoint detection datasets validate the effectiveness of SD-Pose and show satisfactory performance in recovering detailed information for keypoints such as the elbow, hip, and ankle.
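
To make the coordinate-classification idea in the abstract concrete, the following is a minimal sketch of how a 2-D keypoint can be decoupled into two independent 1-D classification targets and decoded again. It illustrates the generic technique only and is not the SD-Pose implementation: the function names `encode_1d` and `decode_1d`, the splitting factor `k`, the optional Gaussian smoothing `sigma`, and the 192x256 input size are all assumptions introduced for illustration.

```python
import numpy as np


def encode_1d(coord, length, k=2.0, sigma=None):
    """Encode one continuous coordinate as a 1-D classification target.

    The axis of `length` pixels is split into round(length * k) bins
    (k is a hypothetical splitting factor; k > 1 gives sub-pixel bins).
    """
    n_bins = int(round(length * k))
    # Index of the bin closest to the scaled coordinate, clamped to the axis.
    target_bin = min(max(int(round(coord * k)), 0), n_bins - 1)
    if sigma is None:
        one_hot = np.zeros(n_bins, dtype=np.float32)
        one_hot[target_bin] = 1.0
        return one_hot
    # Optional soft label: a normalized 1-D Gaussian centred on the coordinate.
    bins = np.arange(n_bins, dtype=np.float32)
    soft = np.exp(-((bins - coord * k) ** 2) / (2.0 * sigma ** 2))
    return soft / soft.sum()


def decode_1d(scores, k=2.0):
    """Recover a coordinate from per-bin scores by taking the best bin."""
    return float(np.argmax(scores)) / k


# Assumed input size (width, height) and one assumed ground-truth keypoint.
W, H = 192, 256
x, y = 100.3, 57.8

# The X and Y axes are handled independently (the "space-decoupled" part).
target_x = encode_1d(x, W)
target_y = encode_1d(y, H)

x_hat, y_hat = decode_1d(target_x), decode_1d(target_y)
print(target_x.shape, target_y.shape, x_hat, y_hat)
```

In this sketch the decoding error on each axis is bounded by half a bin, that is 1/(2k) pixels, so increasing `k` reduces quantization error at the cost of longer 1-D outputs. A real model would predict the per-bin scores from image features and train them with a classification loss; that part is omitted here.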

Source journal

Multimedia Systems (Engineering & Technology; Computer Science: Theory & Methods)

CiteScore: 5.40

Self-citation rate: 7.70%

Articles published: 148

Review time: 4.5 months

About the journal: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.