Music-stylized hierarchical dance synthesis with user control

Virtual Reality Intelligent Hardware (JCR Q1, Computer Science) | Pub Date: 2024-10-01 | DOI: 10.1016/j.vrih.2024.06.004

Yanbo Cheng, Yichen Jiang, Yingying Wang

Volume 6, Issue 5, Pages 339-357. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2096579624000342

Abstract

Background

Synthesizing dance motions to match musical inputs is a significant challenge in animation research. Compared to functional human motions, such as locomotion, dance motions are creative and artistic, are often influenced by music, and can stand alone as expressions of body language. Dance choreography requires motion content to follow a general dance genre, whereas dance performances under musical influence are infused with diverse impromptu motion styles. Given the high expressiveness of dance and its variation in space and time, providing accessible and effective user control for tuning dance motion styles remains an open problem.

Methods

In this study, we present a hierarchical framework that decouples the dance synthesis task into independent modules. We use a high-level choreography module built as a Transformer-based sequence model to predict the long-term structure of a dance genre and a low-level realization module that implements dance stylization and synchronization to match the musical input or user preferences. This novel framework allows the individual modules to be trained separately. Because of the decoupling, dance composition can fully utilize existing high-quality dance datasets that do not have musical accompaniments, and the dance implementation can conveniently incorporate user controls and edit motions through a decoder network. Each module is replaceable at runtime, which adds flexibility to the synthesis of dance sequences.
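The decoupling described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `plan`/`realize` methods, the `Keyframe` type, and the beat-time input are all hypothetical, and a fixed cyclic motif stands in for the Transformer-based choreography model. It only shows how a high-level planner and a low-level realizer can sit behind separate interfaces so that either one can be swapped at runtime.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Sequence


@dataclass
class Keyframe:
    """Hypothetical output unit: one motion placed on one music beat."""
    time: float       # seconds, taken from the music beat track
    motion_id: int    # index of the motion unit chosen by the planner
    intensity: float  # user-controlled style weight


class ChoreographyModule(Protocol):
    """High-level module: decides the long-term structure for a genre."""
    def plan(self, genre: str, length: int) -> List[int]: ...


class RealizationModule(Protocol):
    """Low-level module: stylizes and synchronizes the plan to music."""
    def realize(self, plan: Sequence[int], beat_times: Sequence[float],
                style: float) -> List[Keyframe]: ...


class CyclicChoreography:
    """Stand-in planner (NOT a Transformer): repeats a fixed motif per genre."""
    def __init__(self, motifs: Dict[str, List[int]]) -> None:
        self.motifs = motifs

    def plan(self, genre: str, length: int) -> List[int]:
        motif = self.motifs[genre]
        return [motif[i % len(motif)] for i in range(length)]


class BeatAlignedRealization:
    """Stand-in realizer: places one planned motion unit on each beat."""
    def realize(self, plan: Sequence[int], beat_times: Sequence[float],
                style: float) -> List[Keyframe]:
        return [Keyframe(t, m, style) for t, m in zip(beat_times, plan)]


def synthesize(choreo: ChoreographyModule, realizer: RealizationModule,
               genre: str, beat_times: Sequence[float],
               style: float = 1.0) -> List[Keyframe]:
    """Run the two modules in sequence; either can be replaced independently."""
    plan = choreo.plan(genre, len(beat_times))
    return realizer.realize(plan, beat_times, style)


if __name__ == "__main__":
    choreo = CyclicChoreography({"waltz": [0, 1, 2]})
    for frame in synthesize(choreo, BeatAlignedRealization(), "waltz",
                            [0.0, 0.5, 1.0, 1.5], style=0.8):
        print(frame)
```

Because the planner in this sketch never sees the music, it could in principle be fit on dance data without accompaniment, which mirrors the motivation given above; the music and the user's style value enter only at the realization stage.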

Results

Synthesized results demonstrate that our framework generates high-quality, diverse dance motions that adapt well to varying musical conditions and user controls.
