A Tool for Extracting 3D Avatar-Ready Gesture Animations from Monocular Videos

Andrew W. Feng, Samuel Shin, Youngwoo Yoon
{"title":"A Tool for Extracting 3D Avatar-Ready Gesture Animations from Monocular Videos","authors":"Andrew W. Feng, Samuel Shin, Youngwoo Yoon","doi":"10.1145/3561975.3562953","DOIUrl":null,"url":null,"abstract":"Modeling and generating realistic human gesture animations from speech audios has great impacts on creating a believable virtual human that can interact with human users and mimic real-world face-to-face communications. Large-scale datasets are essential in data-driven research, but creating multi-modal gesture datasets with 3D gesture motions and corresponding speech audios is either expensive to create via traditional workflow such as mocap, or producing subpar results via pose estimations from in-the-wild videos. As a result of such limitations, existing gesture datasets either suffer from shorter duration or lower animation quality, making them less ideal for training gesture synthesis models. Motivated by the key limitations from previous datasets and recent progress in human mesh recovery (HMR), we developed a tool for extracting avatar-ready gesture motions from monocular videos with improved animation quality. The tool utilizes a variational autoencoder (VAE) to refine raw gesture motions. The resulting gestures are in a unified pose representation that includes both body and finger motions and can be readily applied to a virtual avatar via online motion retargeting. We validated the proposed tool on existing datasets and created the refined dataset TED-SMPLX by re-processing videos from the original TED dataset. The new dataset is available at https://andrewfengusa.github.io/TED_SMPLX_Dataset.","PeriodicalId":246179,"journal":{"name":"Proceedings of the 15th ACM SIGGRAPH Conference on Motion, Interaction and Games","volume":"199 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 15th ACM SIGGRAPH Conference on Motion, Interaction and Games","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3561975.3562953","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Modeling and generating realistic human gesture animations from speech audio has a great impact on creating believable virtual humans that can interact with human users and mimic real-world face-to-face communication. Large-scale datasets are essential for data-driven research, but multi-modal gesture datasets pairing 3D gesture motions with the corresponding speech audio are either expensive to create via traditional workflows such as motion capture, or yield subpar results when pose estimation is applied to in-the-wild videos. As a result of these limitations, existing gesture datasets suffer from either short duration or low animation quality, making them less than ideal for training gesture synthesis models. Motivated by the key limitations of previous datasets and by recent progress in human mesh recovery (HMR), we developed a tool for extracting avatar-ready gesture motions from monocular videos with improved animation quality. The tool utilizes a variational autoencoder (VAE) to refine the raw gesture motions. The resulting gestures use a unified pose representation that includes both body and finger motions, and they can be readily applied to a virtual avatar via online motion retargeting. We validated the proposed tool on existing datasets and created the refined dataset TED-SMPLX by re-processing videos from the original TED dataset. The new dataset is available at https://andrewfengusa.github.io/TED_SMPLX_Dataset.
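The abstract does not detail the VAE architecture, so the following PyTorch sketch is illustrative only: it shows how a variational autoencoder could refine noisy per-frame HMR pose estimates expressed as SMPL-X-style axis-angle vectors. The class name `GestureRefinementVAE`, the 153-dimensional pose vector (21 body joints plus 2 × 15 hand joints, 3 values each), and all layer sizes are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class GestureRefinementVAE(nn.Module):
    """Illustrative VAE that encodes a noisy per-frame pose vector into
    a latent space and decodes a refined pose. The 153-dim pose assumes
    SMPL-X axis-angle: 21 body + 15 left-hand + 15 right-hand joints,
    3 values each (an assumption, not confirmed by the paper)."""

    def __init__(self, pose_dim: int = 153, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(pose_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, pose_dim),
        )

    def forward(self, pose):
        h = self.encoder(pose)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar


def vae_loss(refined, target, mu, logvar, kl_weight: float = 1e-3):
    """Reconstruction term plus KL divergence to a unit Gaussian prior."""
    recon = nn.functional.mse_loss(refined, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl


# Usage: refine a batch of noisy pose estimates (random stand-ins here).
vae = GestureRefinementVAE()
noisy_poses = torch.randn(64, 153)  # would come from an HMR model
refined, mu, logvar = vae(noisy_poses)
loss = vae_loss(refined, noisy_poses, mu, logvar)
loss.backward()
```

In practice, such a refinement model would more likely operate on temporal windows of poses rather than single frames, so that the latent space can also smooth jitter across time; the per-frame formulation above is kept only to make the sketch minimal.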