{"title":"FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models","authors":"Luca Comanducci, Paolo Bestagini, Stefano Tubaro","doi":"arxiv-2409.10684","DOIUrl":null,"url":null,"abstract":"Text-To-Music (TTM) models have recently revolutionized the automatic music\ngeneration research field. Specifically, by reaching superior performances to\nall previous state-of-the-art models and by lowering the technical proficiency\nneeded to use them. Due to these reasons, they have readily started to be\nadopted for commercial uses and music production practices. This widespread\ndiffusion of TTMs poses several concerns regarding copyright violation and\nrightful attribution, posing the need of serious consideration of them by the\naudio forensics community. In this paper, we tackle the problem of detection\nand attribution of TTM-generated data. We propose a dataset, FakeMusicCaps that\ncontains several versions of the music-caption pairs dataset MusicCaps\nre-generated via several state-of-the-art TTM techniques. We evaluate the\nproposed dataset by performing initial experiments regarding the detection and\nattribution of TTM-generated audio.","PeriodicalId":501284,"journal":{"name":"arXiv - EE - Audio and Speech Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Audio and Speech Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.10684","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Text-To-Music (TTM) models have recently revolutionized the automatic music generation research field, surpassing all previous state-of-the-art models while lowering the technical proficiency needed to use them. For these reasons, they have quickly been adopted for commercial use and in music production practices. This widespread diffusion of TTMs raises several concerns regarding copyright violation and rightful attribution, calling for serious consideration by the audio forensics community. In this paper, we tackle the problem of detection and attribution of TTM-generated data. We propose FakeMusicCaps, a dataset containing several versions of the music-caption pairs dataset MusicCaps, each re-generated with a different state-of-the-art TTM technique. We evaluate the proposed dataset through initial experiments on the detection and attribution of TTM-generated audio.
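
To make the attribution task concrete, below is a minimal sketch of a closed-set attribution baseline one could run on a dataset organized as one folder of audio clips per generator. It is not the paper's method: the folder layout, class names, clip length, and CNN architecture are illustrative assumptions, and the class list should be replaced with the actual generators included in FakeMusicCaps.

```python
# Hypothetical sketch: closed-set attribution of TTM-generated audio.
# Assumes a layout like fakemusiccaps/<class_name>/*.wav, which is an
# illustrative assumption, not the dataset's documented structure.
import pathlib

import torch
import torch.nn as nn
import torchaudio

CLASSES = ["real", "ttm_model_1", "ttm_model_2", "ttm_model_3"]  # placeholder class list
SAMPLE_RATE = 16000
CLIP_SECONDS = 10


class AttributionDataset(torch.utils.data.Dataset):
    """Loads fixed-length mono clips and labels them by their generator folder."""

    def __init__(self, root: str):
        self.items = [
            (path, label)
            for label, name in enumerate(CLASSES)
            for path in pathlib.Path(root, name).glob("*.wav")
        ]
        self.melspec = torchaudio.transforms.MelSpectrogram(
            sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=256, n_mels=64
        )

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        path, label = self.items[idx]
        wav, sr = torchaudio.load(str(path))
        wav = wav.mean(dim=0, keepdim=True)  # force mono
        if sr != SAMPLE_RATE:
            wav = torchaudio.functional.resample(wav, sr, SAMPLE_RATE)
        target_len = SAMPLE_RATE * CLIP_SECONDS
        wav = torch.nn.functional.pad(wav, (0, max(0, target_len - wav.shape[-1])))
        wav = wav[:, :target_len]  # fixed-length clip
        logmel = torch.log(self.melspec(wav) + 1e-6)  # (1, n_mels, frames)
        return logmel, label


class SmallCNN(nn.Module):
    """Tiny CNN over log-mel spectrograms; a baseline classifier, nothing more."""

    def __init__(self, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.net(x)


def train_one_epoch(root: str) -> None:
    """Single training pass over the (assumed) folder-per-class dataset."""
    loader = torch.utils.data.DataLoader(
        AttributionDataset(root), batch_size=8, shuffle=True
    )
    model = SmallCNN(len(CLASSES))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    for logmel, label in loader:
        optimizer.zero_grad()
        loss = criterion(model(logmel), label)
        loss.backward()
        optimizer.step()
```

Detection (real vs. synthetic) follows the same recipe with two classes instead of one class per generator; the interesting open questions the dataset targets, such as generalization to unseen TTM models, would require going beyond this closed-set setup.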