HIDE: Hierarchical iterative decoding enhancement for multi-view 3D human parameter regression
Authors: Weitao Lin, Jiguang Zhang, Weiliang Meng, Xianglong Liu, Xiaopeng Zhang
Journal: Computer Animation and Virtual Worlds, volume 35, issue 3 (Journal Article, published 2024-06-05)
DOI: 10.1002/cav.2266
URL: https://onlinelibrary.wiley.com/doi/10.1002/cav.2266
Impact factor: 0.9; JCR: Q4 (Computer Science, Software Engineering)
Parametric human modeling methods are limited to either single-view frameworks or simple multi-view frameworks, and so fail to fully leverage both the ease of training single-view networks and the occlusion resistance of multi-view images. Object occlusion and self-occlusion, both prevalent in real-world scenarios, reduce the robustness and accuracy of human body parameter prediction. Additionally, many methods overlook the spatial connectivity of human joints in the global estimation of model pose parameters, resulting in cumulative errors in continuous joint parameters. To address these challenges, we propose a flexible and efficient iterative decoding strategy. By extending from single-view images to multi-view video inputs, we achieve local-to-global optimization. We use attention mechanisms to capture the rotational dependencies between any node in the human body and all of its ancestor nodes, thereby enhancing pose decoding capability. We employ a parameter-level iterative fusion of multi-view image data to integrate global pose information flexibly, rapidly obtaining appropriate projection features from different viewpoints and ultimately producing precise parameter estimates. Through experiments, we validate the effectiveness of the HIDE method on the Human3.6M and 3DPW datasets, demonstrating significantly improved visualization results compared to previous methods.
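The abstract's idea of letting each joint attend to all of its ancestor nodes can be illustrated with a small sketch. This is not the authors' implementation: the six-joint tree in `PARENTS`, the function names `ancestor_mask` and `masked_attention`, and the toy features are all assumptions made for illustration, and the attention here is plain scaled dot-product attention restricted by a mask.

```python
import math

# Hypothetical 6-joint kinematic tree: parent index per joint, -1 marks the root.
# 0: pelvis, 1: spine, 2: head, 3: left hip, 4: left knee, 5: left ankle
PARENTS = [-1, 0, 1, 0, 3, 4]


def ancestor_mask(parents):
    """mask[i][j] is True when joint j is joint i itself or one of its ancestors."""
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:  # walk up the tree until we pass the root
            mask[i][j] = True
            j = parents[j]
    return mask


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention; joint i only attends to joints allowed by mask[i]."""
    dim = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        allowed = [j for j, ok in enumerate(mask[i]) if ok]
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in allowed
        ]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * values[j][d] for w, j in zip(weights, allowed))
            for d in range(len(values[0]))
        ])
    return out


# Toy per-joint features (assumed, for illustration only).
mask = ancestor_mask(PARENTS)
feats = [[float(i), 1.0] for i in range(len(PARENTS))]
fused = masked_attention(feats, feats, feats, mask)
```

In this sketch the ankle (joint 5) mixes information only from the pelvis, hip, knee, and itself, while the root attends only to itself. A real model would presumably use learned query, key, and value projections over image features rather than the raw toy vectors used here.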
About the journal:
With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.