Comparative Analysis of Pretrained Audio Representations in Music Recommender Systems

Yan-Martin Tamm, Anna Aljanaki
arXiv - CS - Information Retrieval · Published 2024-09-13 · DOI: arxiv-2409.08987

Abstract

Over the years, Music Information Retrieval (MIR) has proposed various models pretrained on large amounts of music data. Transfer learning showcases the proven effectiveness of pretrained backend models with a broad spectrum of downstream tasks, including auto-tagging and genre classification. However, MIR papers generally do not explore the efficiency of pretrained models for Music Recommender Systems (MRS). In addition, the Recommender Systems community tends to favour traditional end-to-end neural network learning over these models. Our research addresses this gap and evaluates the applicability of six pretrained backend models (MusicFM, Music2Vec, MERT, EncodecMAE, Jukebox, and MusiCNN) in the context of MRS. We assess their performance using three recommendation models: K-nearest neighbours (KNN), shallow neural network, and BERT4Rec. Our findings suggest that pretrained audio representations exhibit significant performance variability between traditional MIR tasks and MRS, indicating that valuable aspects of musical information captured by backend models may differ depending on the task. This study establishes a foundation for further exploration of pretrained audio representations to enhance music recommendation systems.
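The simplest of the three recommendation models evaluated, KNN over pretrained audio representations, can be sketched as follows. This is a hypothetical illustration, not the authors' code: the embedding matrix is random stand-in data where a real pipeline would pool features from a backend such as MERT or MusicFM, and the dimensions (1000 tracks, 768-d vectors) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for pretrained audio representations:
# 1000 tracks, each a 768-d vector (e.g. pooled backend features).
emb = rng.normal(size=(1000, 768)).astype(np.float32)

def recommend(track_id: int, k: int = 5) -> list[int]:
    """Return the k nearest tracks by cosine similarity, excluding the query."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed[track_id]          # cosine similarity to all tracks
    order = np.argsort(-sims)                 # most similar first
    return [int(i) for i in order if i != track_id][:k]

recs = recommend(42)
```

A content-based KNN like this needs no interaction data, which is what makes it a clean probe of how much recommendation-relevant signal the audio representation itself carries.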