Human-pose estimation based on weak supervision

Virtual Reality & Intelligent Hardware (JCR Q1, Computer Science). Pub Date: 2023-08-01. Epub Date: 2023-08-24. DOI: 10.1016/j.vrih.2022.08.010

Xiaoyan Hu, Xizhao Bao, Guoli Wei, Zhaoyu Li

Volume 5, Issue 4, Pages 366-377. Journal Article. https://www.sciencedirect.com/science/article/pii/S2096579622000857




Human-pose estimation based on weak supervision

Background

In computer vision, simultaneously estimating human pose, shape, and clothing has practical value, but it remains challenging owing to the variety of clothing, the complexity of deformation, the shortage of large-scale datasets, and the difficulty of estimating clothing style.

Methods

We propose a multistage weakly supervised method that makes full use of sparsely labeled data to learn to estimate human body shape, pose, and clothing deformation. In the first stage, the SMPL human-body model parameters are regressed from multi-view 2D key points of the human body. Using multi-view information as weak supervision avoids the depth ambiguity of a single view, yields a more accurate human pose, and is easy to obtain. In the second stage, clothing is represented by a PCA-based model whose parameters are regressed using 2D key points of the clothing as supervision. In the third stage, we predefine an embedding graph for each type of clothing to describe its deformation, and then use the clothing mask to further adjust that deformation. To facilitate training, we constructed a multi-view synthetic dataset that includes BCNet and SURREAL.
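The role of multi-view 2D key points in the first stage can be illustrated with a small reprojection-loss sketch. This is not the authors' implementation: it uses raw 3D joint positions instead of SMPL parameters, and the orbiting pinhole camera, the function names (`rotate_y`, `project`, `multiview_keypoint_loss`), and the joint coordinates are all assumptions chosen for illustration. It shows that a wrong 3D joint lying on the same camera ray as the true joint gives near-zero error in a single view, while adding a second view exposes the error.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point, yaw, focal=1.0, distance=5.0):
    """Pinhole projection for a hypothetical camera orbiting the subject.

    The scene is rotated by `yaw` about the y axis and then viewed from
    `distance` units away along the z axis.
    """
    x, y, z = rotate_y(point, yaw)
    depth = z + distance
    return (focal * x / depth, focal * y / depth)

def multiview_keypoint_loss(joints_3d, keypoints_2d_per_view, yaws):
    """Mean squared 2D reprojection error over all views and joints."""
    total, count = 0.0, 0
    for yaw, keypoints_2d in zip(yaws, keypoints_2d_per_view):
        for joint, (u_obs, v_obs) in zip(joints_3d, keypoints_2d):
            u, v = project(joint, yaw)
            total += (u - u_obs) ** 2 + (v - v_obs) ** 2
            count += 1
    return total / count

# Illustrative values only: one true joint and a wrong estimate that lies
# on the same ray from the front camera (same image point, different depth).
true_joints = [(0.5, 1.0, 0.0)]
ambiguous_joints = [(0.6, 1.2, 1.0)]

front_only = [0.0]
front_and_side = [0.0, math.pi / 2]

obs_front = [[project(j, yaw) for j in true_joints] for yaw in front_only]
obs_both = [[project(j, yaw) for j in true_joints] for yaw in front_and_side]

# Single view: the wrong estimate is indistinguishable from the true joint.
print("single-view loss:", multiview_keypoint_loss(ambiguous_joints, obs_front, front_only))
# Two views: the side camera penalizes the wrong depth.
print("two-view loss:", multiview_keypoint_loss(ambiguous_joints, obs_both, front_and_side))
```

In a full pipeline the 3D joints would come from a body model driven by regressed parameters, and the loss would be minimized by gradient descent; the sketch only evaluates the loss to show why a second view constrains depth.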

Results

Experiments show that our method, using only weak supervision, reaches the same level of accuracy as SOTA methods that use strong supervision. Because weak supervision is much easier to obtain, the method has the advantage of being able to use existing data as training data. Experiments on the DeepFashion2 dataset show that our method can make full use of the available weak supervision to fine-tune on a dataset with little annotation, whereas strongly supervised methods cannot be trained or fine-tuned on it owing to the lack of exact annotations.

Conclusions

Our weakly supervised method can accurately estimate human body size, pose, and several common types of clothing, and it overcomes the current shortage of clothing data.


Source journal