Frequency-Anchored Deep Networks for Polyphonic Melody Extraction
Aman Kumar Sharma, Kavya Ranjan Saxena, Vipul Arora
2021 National Conference on Communications (NCC), published 2021-07-27
DOI: https://doi.org/10.1109/NCC52529.2021.9530037
Abstract
Extracting the predominant melodic line from polyphonic audio, in which more than one source plays simultaneously, is a challenging task in music information retrieval. The proposed method uses deep classifiers to estimate fine-grained F0 values rather than coarse note labels. Frequency-anchored input features extracted from the constant-Q transform make the signature of the melody independent of F0. The scheme also mitigates the data imbalance across classes, since it uses only two or three output classes instead of a large number of note classes. Experimental evaluation shows that the proposed method outperforms a state-of-the-art deep learning-based melody estimation method.
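
The abstract does not spell out how the frequency-anchored features are built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a log-frequency (constant-Q style) magnitude frame in which harmonic h of a tone sits a fixed number of bins above the fundamental, namely round(B * log2(h)) for B bins per octave. Reading the frame at those offsets relative to a candidate F0 bin gives a feature vector that does not depend on the absolute F0. All names here (`harmonic_offsets`, `anchored_features`, `synthetic_frame`) and the parameter values are hypothetical, and the input is a synthetic frame rather than a real constant-Q transform.

```python
import numpy as np

def harmonic_offsets(n_harmonics, bins_per_octave):
    """Bin offsets of harmonics 1..n above the fundamental on a log-frequency axis."""
    h = np.arange(1, n_harmonics + 1)
    return np.round(bins_per_octave * np.log2(h)).astype(int)

def anchored_features(frame, candidate_bin, offsets):
    """Read the frame at harmonic positions relative to a candidate F0 bin.

    Positions that fall beyond the top of the frame are left at zero.
    """
    idx = candidate_bin + offsets
    feats = np.zeros(len(offsets))
    valid = idx < len(frame)
    feats[valid] = frame[idx[valid]]
    return feats

def synthetic_frame(f0_bin, offsets, n_bins, amps):
    """Toy log-frequency magnitude frame holding one harmonic tone (assumption)."""
    frame = np.zeros(n_bins)
    for off, amp in zip(offsets, amps):
        if f0_bin + off < n_bins:
            frame[f0_bin + off] += amp
    return frame

# Hypothetical settings chosen only for this illustration.
BINS_PER_OCTAVE = 36
N_BINS = 6 * BINS_PER_OCTAVE
offsets = harmonic_offsets(5, BINS_PER_OCTAVE)
amps = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])

# The same timbre at two different fundamentals.
frame_a = synthetic_frame(10, offsets, N_BINS, amps)
frame_b = synthetic_frame(40, offsets, N_BINS, amps)

# Anchoring at the true F0 bin yields the same feature vector for both tones.
feat_a = anchored_features(frame_a, 10, offsets)
feat_b = anchored_features(frame_b, 40, offsets)

# Anchoring at a wrong candidate bin gives a different pattern.
feat_wrong = anchored_features(frame_a, 11, offsets)
```

Because the anchored pattern looks the same at every pitch, a classifier over such features would only need to decide a small number of things per candidate (for example, whether the candidate is the melody F0 or not), which is consistent with the two or three output classes mentioned in the abstract.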