Diffusion-based Human Motion Style Transfer with Semantic Guidance

Computer Graphics Forum · Published 2024-10-09 · DOI: 10.1111/cgf.15169 · Impact Factor 2.7 · JCR Q2 (Computer Science, Software Engineering) · CAS Tier 4 (Computer Science)
Lei Hu, Zihao Zhang, Yongjing Ye, Yiwen Xu, Shihong Xia
{"title":"Diffusion-based Human Motion Style Transfer with Semantic Guidance","authors":"Lei Hu,&nbsp;Zihao Zhang,&nbsp;Yongjing Ye,&nbsp;Yiwen Xu,&nbsp;Shihong Xia","doi":"10.1111/cgf.15169","DOIUrl":null,"url":null,"abstract":"<p>3D Human motion style transfer is a fundamental problem in computer graphic and animation processing. Existing AdaIN-based methods necessitate datasets with balanced style distribution and content/style labels to train the clustered latent space. However, we may encounter a single unseen style example in practical scenarios, but not in sufficient quantity to constitute a style cluster for AdaIN-based methods. Therefore, in this paper, we propose a novel two-stage framework for few-shot style transfer learning based on the diffusion model. Specifically, in the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior so that it can cope with various content motion inputs. In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer. The key idea is regarding the reverse process of diffusion as a motion-style translation process since the motion styles can be viewed as special motion variations. During the fine-tuning for style transfer, a simple yet effective semantic-guided style transfer loss coordinated with style example reconstruction loss is introduced to supervise the style transfer in CLIP semantic space. The qualitative and quantitative evaluations demonstrate that our method can achieve state-of-the-art performance and has practical applications. The source code is available at https://github.com/hlcdyy/diffusion-based-motion-style-transfer.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"43 8","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Graphics Forum","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cgf.15169","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0

Abstract

3D human motion style transfer is a fundamental problem in computer graphics and animation processing. Existing AdaIN-based methods require datasets with a balanced style distribution and content/style labels to train a clustered latent space. In practical scenarios, however, we may encounter only a single unseen style example, which is not enough to constitute a style cluster for AdaIN-based methods. Therefore, in this paper, we propose a novel two-stage framework for few-shot style transfer learning based on a diffusion model. Specifically, in the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior so that it can cope with various content motion inputs. In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer. The key idea is to regard the reverse diffusion process as a motion-style translation process, since motion styles can be viewed as special motion variations. During the fine-tuning for style transfer, a simple yet effective semantic-guided style transfer loss, coordinated with a style example reconstruction loss, is introduced to supervise the style transfer in CLIP semantic space. Qualitative and quantitative evaluations demonstrate that our method achieves state-of-the-art performance and has practical applications. The source code is available at https://github.com/hlcdyy/diffusion-based-motion-style-transfer.
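To make the second-stage objective more concrete, below is a minimal PyTorch-style sketch of how the style example reconstruction loss and the semantic-guided style transfer loss described in the abstract might be coordinated during fine-tuning. The `denoiser`, `clip_motion_encoder`, and `clip_text_encoder` callables, their signatures, and the x0-prediction parameterization are placeholders assumed for illustration, not the authors' actual interfaces; see the repository linked above for the real implementation.

```python
import torch
import torch.nn.functional as F

def fine_tuning_losses(denoiser, clip_motion_encoder, clip_text_encoder,
                       content_motion, style_example, style_text_tokens,
                       alphas_cumprod):
    """Sketch of the two fine-tuning losses (assumed interfaces):
    (1) reconstruction of the single style example under the denoising objective,
    (2) a semantic-guided loss that pulls the style-transferred content motion
        toward the target style description in CLIP space.
    `denoiser` is assumed to predict the clean motion (x0-parameterization)."""
    B = content_motion.shape[0]
    device = content_motion.device
    t = torch.randint(0, alphas_cumprod.shape[0], (B,), device=device)

    # (1) Style example reconstruction: noise the example, ask the model to recover it.
    a_s = alphas_cumprod[t[:1]].view(1, 1, 1)
    noise_s = torch.randn_like(style_example)
    noisy_style = a_s.sqrt() * style_example + (1.0 - a_s).sqrt() * noise_s
    recon_loss = F.mse_loss(denoiser(noisy_style, t[:1]), style_example)

    # (2) Semantic-guided style transfer: treat the reverse (denoising) step as a
    #     motion-style translation of the content input, supervised in CLIP space
    #     with a cosine-distance loss against the style text embedding.
    a_t = alphas_cumprod[t].view(B, 1, 1)
    noise_c = torch.randn_like(content_motion)
    noisy_content = a_t.sqrt() * content_motion + (1.0 - a_t).sqrt() * noise_c
    transferred = denoiser(noisy_content, t)
    motion_emb = F.normalize(clip_motion_encoder(transferred), dim=-1)
    text_emb = F.normalize(clip_text_encoder(style_text_tokens), dim=-1)
    style_loss = 1.0 - (motion_emb * text_emb).sum(dim=-1).mean()

    return recon_loss, style_loss
```

In practice the two losses would be weighted and summed into a single fine-tuning objective; how motion is actually projected into CLIP space is a design choice of the paper, and the snippet only illustrates the coordination of the two supervision signals.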

Source journal: Computer Graphics Forum (Engineering & Technology – Computer Science: Software Engineering)
CiteScore: 5.80 · Self-citation rate: 12.00% · Articles per year: 175 · Review time: 3-6 weeks
Journal description: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.