Remixing Music with Visual Conditioning

Li-Chia Yang, Alexander Lerch
Published in: 2020 IEEE International Symposium on Multimedia (ISM), 2020-10-27
DOI: 10.1109/ISM.2020.00039 (https://doi.org/10.1109/ISM.2020.00039)
Citations: 3

Abstract

We propose a visually conditioned music remixing system that combines deep visual and audio models. The method builds on a state-of-the-art audio-visual source separation model that separates musical instrument sources using video information. We modify the model to accept user-selected images instead of video as the visual input during inference, which enables separation of audio-only content. Furthermore, we propose a remixing engine that generalizes the task of source separation to music remixing. The proposed method achieves improved audio quality compared to remixing performed with the separate-and-add method using a state-of-the-art audio-visual source separation model.
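
To make the separate-and-add baseline named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the separation model outputs one time-frequency mask per instrument, which the abstract does not state, and that a remix is specified as one gain per instrument. The names `apply_mask`, `separate_and_add`, `masks`, and `gains` are hypothetical. Under these assumptions, plain source separation is the special case in which one instrument has gain 1 and all others have gain 0, which is one way to read "generalizes the task of source separation to music remixing".

```python
from typing import Dict, List

# A magnitude spectrogram as nested lists: [frequency bin][time frame].
Spectrogram = List[List[float]]


def apply_mask(mixture: Spectrogram, mask: Spectrogram) -> Spectrogram:
    """Element-wise product of the mixture spectrogram and one source mask."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


def separate_and_add(mixture: Spectrogram,
                     masks: Dict[str, Spectrogram],
                     gains: Dict[str, float]) -> Spectrogram:
    """Baseline remix: separate every source, scale it, and sum the results.

    `masks` stands in for the output of a separation model (assumed here,
    not taken from the paper). Sources missing from `gains` keep gain 1.0.
    """
    rows, cols = len(mixture), len(mixture[0])
    remix = [[0.0] * cols for _ in range(rows)]
    for name, mask in masks.items():
        source = apply_mask(mixture, mask)   # separation step
        gain = gains.get(name, 1.0)
        for i in range(rows):
            for j in range(cols):
                remix[i][j] += gain * source[i][j]   # add step
    return remix


# Toy data: 2 frequency bins x 2 time frames, two instruments.
mixture = [[1.0, 2.0], [4.0, 8.0]]
masks = {
    "guitar": [[0.75, 0.5], [0.25, 0.0]],
    "violin": [[0.25, 0.5], [0.75, 1.0]],
}

# Remix: boost the guitar, attenuate the violin.
remix = separate_and_add(mixture, masks, {"guitar": 2.0, "violin": 0.5})

# Separation as a special case: keep only the guitar.
guitar_only = separate_and_add(mixture, masks, {"guitar": 1.0, "violin": 0.0})
```

In this baseline, any error in an estimated mask is carried into the sum, scaled by that source's gain. The abstract reports that the proposed remixing engine achieves better audio quality than this separate-and-add approach.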