Make static person walk again via separating pose action from shape

Graphical Models (IF 2.2; Region 4, Computer Science; JCR Q2, Computer Science, Software Engineering) Pub Date: 2024-08-01 Epub Date: 2024-07-03 DOI: 10.1016/j.gmod.2024.101222

Yongwei Nie , Meihua Zhao , Qing Zhang , Ping Li , Jian Zhu , Hongmin Cai

Graphical Models, Volume 134, Article 101222. Publication type: Journal Article. ScienceDirect: https://www.sciencedirect.com/science/article/pii/S1524070324000109

Citations: 0

Abstract

This paper addresses the problem of animating a person in static images, the core task of which is to infer future poses for the person. Existing approaches predict future poses in the 2D space, suffering from entanglement of pose action and shape. We propose a method that generates actions in the 3D space and then transfers them to the 2D person. We first lift the 2D pose of the person to a 3D skeleton, then propose a 3D action synthesis network predicting future skeletons, and finally devise a self-supervised action transfer network that transfers the actions of 3D skeletons to the 2D person. Actions generated in the 3D space look plausible and vivid. More importantly, self-supervised action transfer allows our method to be trained only on a 3D MoCap dataset while being able to process images in different domains. Experiments on three image datasets validate the effectiveness of our method.
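The three stages named in the abstract (lift, synthesize, transfer) can be illustrated with a toy data-flow sketch. This is not the authors' implementation: all three functions, the three-joint skeleton, and the `PARENTS` table are hypothetical stand-ins, and the learned networks are replaced by simple geometry. The only idea it tries to show is the separation of action from shape: motion is taken from the 3D skeleton as bone directions, while bone lengths come from the person's own 2D pose.

```python
import numpy as np

# Hypothetical toy skeleton: a 3-joint chain (hip -> knee -> ankle).
# PARENTS[j] is the parent of joint j; -1 marks the root.
PARENTS = [-1, 0, 1]

def lift_to_3d(pose_2d):
    """Stage 1 stand-in: append zero depth (the paper uses a learned lifter)."""
    return np.hstack([pose_2d, np.zeros((len(pose_2d), 1))])

def synthesize_future(skel_3d, n_frames, step_deg=10.0):
    """Stage 2 stand-in: rotate the skeleton about the root's z-axis.

    A rigid rotation is used only because it yields valid future skeletons;
    the paper predicts them with a 3D action synthesis network.
    """
    root = skel_3d[0]
    frames = []
    for k in range(1, n_frames + 1):
        a = np.deg2rad(step_deg * k)
        rot = np.array([[np.cos(a), -np.sin(a), 0.0],
                        [np.sin(a),  np.cos(a), 0.0],
                        [0.0,        0.0,       1.0]])
        frames.append((skel_3d - root) @ rot.T + root)
    return np.stack(frames)

def bone_lengths(pose):
    """Length of every bone (joint to parent), in joint order."""
    return np.array([np.linalg.norm(pose[j] - pose[p])
                     for j, p in enumerate(PARENTS) if p >= 0])

def transfer_to_2d(future_3d, pose_2d):
    """Stage 3 stand-in: keep the person's 2D bone lengths (shape) and
    take only bone directions (action) from the projected 3D skeletons."""
    lengths = bone_lengths(pose_2d)
    out = []
    for skel in future_3d:
        new_pose = np.zeros_like(pose_2d)
        new_pose[0] = pose_2d[0]              # root stays where the person is
        bone = 0
        for j, p in enumerate(PARENTS):       # parents precede children
            if p < 0:
                continue
            d = (skel[j] - skel[p])[:2]       # orthographic projection
            d = d / (np.linalg.norm(d) + 1e-8)
            new_pose[j] = new_pose[p] + lengths[bone] * d
            bone += 1
        out.append(new_pose)
    return np.stack(out)

pose_2d = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.5]])
animated = transfer_to_2d(synthesize_future(lift_to_3d(pose_2d), 3), pose_2d)
print(animated.shape)  # (frames, joints, 2)
```

In this sketch the output poses move with the synthesized skeletons but keep the input person's bone lengths, which is the sense in which action and shape are handled separately. The self-supervised training that the abstract describes is not modeled here.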


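Source journal

Graphical Models (Engineering and Technology: Computer Science, Software Engineering)

CiteScore: 3.60

Self-citation rate: 5.90%

Articles published: (not listed)

Review time: 47 days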

About the journal: Graphical Models is recognized internationally as a highly rated, top tier journal and is focused on the creation, geometric processing, animation, and visualization of graphical models and on their applications in engineering, science, culture, and entertainment. GMOD provides its readers with thoroughly reviewed and carefully selected papers that disseminate exciting innovations, that teach rigorous theoretical foundations, that propose robust and efficient solutions, or that describe ambitious systems or applications in a variety of topics. We invite papers in five categories: research (contributions of novel theoretical or practical approaches or solutions), survey (opinionated views of the state-of-the-art and challenges in a specific topic), system (the architecture and implementation details of an innovative architecture for a complete system that supports model/animation design, acquisition, analysis, visualization…), application (description of a novel application of known techniques and evaluation of its impact), or lecture (an elegant and inspiring perspective on previously published results that clarifies them and teaches them in a new way). GMOD offers its authors an accelerated review, feedback from experts in the field, immediate online publication of accepted papers, no restriction on color and length (when justified by the content) in the online version, and a broad promotion of published papers. A prestigious group of editors selected from among the premier international researchers in their fields oversees the review process.