{"title":"利用监督对比学习和艺术家信息识别音乐年代","authors":"Qiqi He, Xuchen Song, Weituo Hao, Ju-Chiang Wang, Wei-Tsung Lu, Wei Li","doi":"arxiv-2407.05368","DOIUrl":null,"url":null,"abstract":"Does popular music from the 60s sound different than that of the 90s? Prior\nstudy has shown that there would exist some variations of patterns and\nregularities related to instrumentation changes and growing loudness across\nmulti-decadal trends. This indicates that perceiving the era of a song from\nmusical features such as audio and artist information is possible. Music era\ninformation can be an important feature for playlist generation and\nrecommendation. However, the release year of a song can be inaccessible in many\ncircumstances. This paper addresses a novel task of music era recognition. We\nformulate the task as a music classification problem and propose solutions\nbased on supervised contrastive learning. An audio-based model is developed to\npredict the era from audio. For the case where the artist information is\navailable, we extend the audio-based model to take multimodal inputs and\ndevelop a framework, called MultiModal Contrastive (MMC) learning, to enhance\nthe training. 
Experimental result on Million Song Dataset demonstrates that the\naudio-based model achieves 54% in accuracy with a tolerance of 3-years range;\nincorporating the artist information with the MMC framework for training leads\nto 9% improvement further.","PeriodicalId":501178,"journal":{"name":"arXiv - CS - Sound","volume":"38 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Music Era Recognition Using Supervised Contrastive Learning and Artist Information\",\"authors\":\"Qiqi He, Xuchen Song, Weituo Hao, Ju-Chiang Wang, Wei-Tsung Lu, Wei Li\",\"doi\":\"arxiv-2407.05368\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Does popular music from the 60s sound different than that of the 90s? Prior\\nstudy has shown that there would exist some variations of patterns and\\nregularities related to instrumentation changes and growing loudness across\\nmulti-decadal trends. This indicates that perceiving the era of a song from\\nmusical features such as audio and artist information is possible. Music era\\ninformation can be an important feature for playlist generation and\\nrecommendation. However, the release year of a song can be inaccessible in many\\ncircumstances. This paper addresses a novel task of music era recognition. We\\nformulate the task as a music classification problem and propose solutions\\nbased on supervised contrastive learning. An audio-based model is developed to\\npredict the era from audio. For the case where the artist information is\\navailable, we extend the audio-based model to take multimodal inputs and\\ndevelop a framework, called MultiModal Contrastive (MMC) learning, to enhance\\nthe training. 
Experimental result on Million Song Dataset demonstrates that the\\naudio-based model achieves 54% in accuracy with a tolerance of 3-years range;\\nincorporating the artist information with the MMC framework for training leads\\nto 9% improvement further.\",\"PeriodicalId\":501178,\"journal\":{\"name\":\"arXiv - CS - Sound\",\"volume\":\"38 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Sound\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2407.05368\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Sound","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.05368","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Music Era Recognition Using Supervised Contrastive Learning and Artist Information
Does popular music from the 60s sound different from that of the 90s? Prior
studies have shown patterns and regularities, related to instrumentation
changes and growing loudness, across multi-decade trends. This suggests that
the era of a song can be perceived from musical features such as audio and
artist information. Music era
information can be an important feature for playlist generation and
recommendation. However, the release year of a song is often unavailable. This
paper addresses the novel task of music era recognition. We
formulate the task as a music classification problem and propose solutions
based on supervised contrastive learning. An audio-based model is developed to
predict the era from audio. For the case where the artist information is
available, we extend the audio-based model to take multimodal inputs and
develop a framework, called MultiModal Contrastive (MMC) learning, to enhance
the training. Experimental results on the Million Song Dataset demonstrate that
the audio-based model achieves 54% accuracy with a tolerance of ±3 years;
incorporating the artist information via the MMC framework during training
yields a further 9% improvement.
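The training objective builds on supervised contrastive learning. The abstract gives no implementation details, so the following is a minimal NumPy sketch of the standard supervised contrastive (SupCon) loss of Khosla et al. (2020), with era buckets standing in for class labels; the function name, temperature value, and batch layout are illustrative assumptions, not the paper's code:

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch.

    embeddings: (N, D) array, assumed L2-normalized.
    labels: (N,) integer class labels (e.g. era buckets).
    """
    n = embeddings.shape[0]
    sim = embeddings @ embeddings.T / temperature        # pairwise similarities
    logits_mask = ~np.eye(n, dtype=bool)                 # exclude self-pairs
    sim = sim - sim.max(axis=1, keepdims=True)           # numerical stability
    exp_sim = np.exp(sim) * logits_mask
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # positives: samples sharing the anchor's label, excluding the anchor itself
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0                               # anchors with >=1 positive
    mean_log_prob_pos = (pos_mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()
```

Embeddings that cluster by era give a near-zero loss, while randomly placed embeddings give a large one, which is the gradient signal that pulls same-era songs together.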
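The reported 54% figure counts a prediction as correct when it falls within ±3 years of the true release year. The paper's exact evaluation code is not given, so this is a minimal sketch of such a tolerance-based accuracy (function and variable names are illustrative):

```python
def tolerance_accuracy(pred_years, true_years, tolerance=3):
    """Fraction of predicted release years within +/- `tolerance` of the truth."""
    hits = sum(abs(p - t) <= tolerance for p, t in zip(pred_years, true_years))
    return hits / len(true_years)

# A prediction of 1968 for a 1965 song counts as correct; 1994 for 1999 does not.
print(tolerance_accuracy([1968, 1994], [1965, 1999]))  # 0.5
```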