On fusion of timbre-motivated features for singing voice detection and singer identification
T. Nwe, Haizhou Li
2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008-05-12
DOI: 10.1109/ICASSP.2008.4518087 (https://doi.org/10.1109/ICASSP.2008.4518087)
Citations: 27
Abstract
Timbre is the quality of sound that allows the ear to distinguish between musical sounds. In this paper, we study the effect of timbre on identifying singing voice segments in popular songs. First, we distinguish between singing voice and instrumental segments in a song. Then, the singing voice segments are further categorized by singer identity. Timbre-motivated effects are formulated by fusing systems that use vibrato features, harmonic information, and other features extracted with Mel and log frequency scale filter banks. We propose statistical methods for selecting singing voice segments with high confidence measures to improve singer identification performance. Experiments conducted on a database of 214 popular songs show that the proposed approach is effective.
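The abstract does not say how the subsystems are fused or how confidence is measured, so the following is only a minimal sketch of one plausible reading: score-level fusion by weighted average, a fixed threshold as the confidence test, and singer identification by summing per-singer scores over the retained segments. The function names, weights, threshold, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def fuse_scores(system_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-segment vocal scores across subsystems.

    Assumption: each subsystem (e.g. vibrato, harmonic, Mel filter bank)
    emits one score per segment, higher meaning "more likely singing".
    """
    if len(system_scores) != len(weights):
        raise ValueError("need exactly one weight per subsystem")
    n_segments = len(system_scores[0])
    if any(len(scores) != n_segments for scores in system_scores):
        raise ValueError("all subsystems must score the same segments")
    total_weight = sum(weights)
    return [
        sum(w * scores[i] for w, scores in zip(weights, system_scores)) / total_weight
        for i in range(n_segments)
    ]


def select_confident(fused: Sequence[float], threshold: float) -> List[int]:
    """Indices of segments whose fused score reaches the threshold."""
    return [i for i, score in enumerate(fused) if score >= threshold]


def identify_singer(singer_scores: Dict[str, Sequence[float]],
                    selected: Sequence[int]) -> str:
    """Pick the singer with the highest summed score over selected segments.

    Assumption: singer_scores holds per-segment log-likelihoods per singer.
    """
    if not selected:
        raise ValueError("no segments were selected")
    totals = {
        name: sum(scores[i] for i in selected)
        for name, scores in singer_scores.items()
    }
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Made-up scores for four segments from three hypothetical subsystems.
    vibrato = [0.9, 0.2, 0.8, 0.4]
    harmonic = [0.7, 0.1, 0.9, 0.6]
    mel_bank = [0.8, 0.3, 0.7, 0.5]
    fused = fuse_scores([vibrato, harmonic, mel_bank], [1.0, 1.0, 1.0])
    selected = select_confident(fused, threshold=0.7)

    # Made-up per-segment log-likelihoods for two hypothetical singers.
    singer_scores = {
        "singer_a": [-1.0, -5.0, -1.5, -4.0],
        "singer_b": [-2.0, -0.5, -2.5, -1.0],
    }
    print(selected, identify_singer(singer_scores, selected))
```

With these made-up numbers, only the first and third segments pass the threshold, and restricting the decision to them changes the identified singer compared with summing over all four segments. That is the kind of effect confidence-based segment selection is meant to exploit, although the paper's actual statistical criteria may differ.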