Do as we do: Multiple Person Video-To-Video Transfer

2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR) Pub Date : 2021-04-10 DOI:10.1109/MIPR51284.2021.00020

Mickael Cormier, Houraalsadat Mortazavi Moshkenan, Franz Lörch, J. Metzler, J. Beyerer


Citations: 2

Abstract

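Our goal is to transfer the motion of real people from a source video to a target video with realistic results. While recent advances have significantly improved image-to-image translation, only a few works account for body motion and temporal consistency, and those focus only on video retargeting for a single actor. In this work, we propose a marker-less approach for multiple-person video-to-video transfer using pose as an intermediate representation. Given a source video with multiple persons dancing or working out, our method transfers the body motion of all actors to a new set of actors in a different video. Unlike recent "do as I do" methods, we focus specifically on transferring multiple persons at the same time and tackle the related identity switch problem. Our method convincingly transfers body motion to the target video while preserving specific features of the target video, such as feet touching the floor and the relative position of the actors. The evaluation is performed with visual quality and appearance metrics using publicly available videos with the permission of their owners.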
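The abstract names the identity switch problem but does not say how it is handled. As an illustration only, the following Python sketch shows one generic way to keep per-person identities stable when poses are detected independently in each frame: match every pose in the current frame to the closest pose in the previous frame. The pose format (a list of `(x, y)` keypoints), the function names, and the brute-force matching are all assumptions made for this example and are not taken from the paper.

```python
from itertools import permutations
from math import hypot
from typing import List, Sequence, Tuple

# Hypothetical pose format: one (x, y) pair per body keypoint.
Pose = List[Tuple[float, float]]


def pose_distance(a: Pose, b: Pose) -> float:
    """Mean Euclidean distance between corresponding keypoints."""
    return sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b)) / len(a)


def match_identities(prev: Sequence[Pose], curr: Sequence[Pose]) -> List[int]:
    """Return `order` such that curr[order[i]] continues identity i of `prev`.

    Tries every permutation and keeps the one with the lowest total
    distance. Assumes both frames contain the same, small number of people.
    """
    n = len(prev)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(pose_distance(prev[i], curr[p[i]]) for i in range(n)),
    )
    return list(best)


def track(frames: Sequence[Sequence[Pose]]) -> List[List[Pose]]:
    """Reorder detections so that index i refers to the same person in every frame."""
    tracked = [list(frames[0])]
    for detections in frames[1:]:
        order = match_identities(tracked[-1], detections)
        tracked.append([detections[j] for j in order])
    return tracked


# Two people; the detector returns them in swapped order in the second frame.
person_a = [(0.0, 0.0), (0.0, 1.0)]
person_b = [(10.0, 0.0), (10.0, 1.0)]
person_a_next = [(0.5, 0.0), (0.5, 1.0)]
person_b_next = [(10.5, 0.0), (10.5, 1.0)]

frames = [[person_a, person_b], [person_b_next, person_a_next]]
stable = track(frames)
```

Here `stable[1]` lists the poses in the order of the first frame again, so each source actor keeps driving the same target actor. Brute-force matching only scales to a handful of people; for larger groups an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual replacement.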



