Occluded human pose estimation based on part-aware discrete diffusion priors

Knowledge-Based Systems, Volume 315, Article 113272 (IF 7.6; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence)
Pub Date: 2025-04-22 | Epub Date: 2025-03-08 | DOI: 10.1016/j.knosys.2025.113272
https://www.sciencedirect.com/science/article/pii/S0950705125003193
Hongyu Xiao, Hui He, Yifan Xie, Yi Zheng
Citations: 0

Abstract

In this work, we focus on reconstructing human poses from RGB images, with particular attention given to the ambiguity issues caused by complex scenes such as occlusions. The main challenges we face are twofold: how to reconstruct a complete pose based on limited visible cues and how to handle the uncertainty of occluded parts. To address these issues, our primary approach is to leverage human prior knowledge to ensure the physical plausibility of the reconstructed pose and simulate occluded scenarios through the forward process of the diffusion model, followed by recovering the occluded parts through the reverse process. Specifically, we first train hierarchical encoders, codebooks, and decoders to learn rich pose prior knowledge and then incorporate these priors into a discrete diffusion model with multimodal guidance. We train the network to gradually predict clean discrete pose tokens that are consistent with prior knowledge and ultimately decode them into complete body poses. Extensive experimental results on the COCO and 3DMPB datasets demonstrate that our method achieves state-of-the-art performance compared with previous approaches. The code will be publicly available.
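The abstract names two building blocks: a codebook that turns pose features into discrete tokens, and a diffusion forward process that simulates occlusion. The following is a minimal sketch of both ideas, not the authors' implementation. The abstract does not specify the transition kernel, so the absorbing-state masking used here is an assumption, and the names `MASK`, `quantize`, and `forward_mask`, as well as the toy codebook values, are hypothetical.

```python
import random

MASK = -1  # hypothetical id for an absorbing "occluded" state (assumption)


def quantize(feature, codebook):
    """Return the index of the codebook entry nearest to `feature` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(feature, codebook[k]))


def forward_mask(tokens, t, num_steps, rng):
    """Forward corruption of an absorbing-state discrete diffusion (assumed form):
    each token is independently replaced by MASK with probability t / num_steps."""
    p = t / num_steps
    return [MASK if rng.random() < p else tok for tok in tokens]


# Toy 2-D codebook with four entries (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# One made-up feature vector per body part.
part_features = [(0.1, -0.1), (0.9, 0.2), (0.2, 0.8)]
tokens = [quantize(f, codebook) for f in part_features]

rng = random.Random(0)
clean = forward_mask(tokens, t=0, num_steps=10, rng=rng)          # nothing masked
half = forward_mask(tokens, t=5, num_steps=10, rng=rng)           # some tokens may be masked
fully_masked = forward_mask(tokens, t=10, num_steps=10, rng=rng)  # everything masked
```

In a setup like this, a reverse-process network would be trained to predict the clean token at every masked position, conditioned on image features, and a decoder would map the recovered tokens back to a full pose.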
Source journal
Knowledge-Based Systems (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.