On the Use of Auditory Representations for Sparsity-Based Sound Source Separation
J. Burred, T. Sikora
2005 5th International Conference on Information Communications & Signal Processing, 2005-12-06
DOI: 10.1109/ICICS.2005.1689302
Sparsity-based source separation algorithms often rely on a transformation into a sparse domain to improve mixture disjointness and thereby facilitate separation. To this end, the most commonly used time-frequency representation has been the short-time Fourier transform (STFT). The purpose of this paper is to study the use of auditory-based representations instead of the STFT. We first evaluate the disjointness properties of the STFT for speech and music signals, and then show that auditory representations based on the equivalent rectangular bandwidth (ERB) and Bark frequency scales can improve the disjointness of the transformed mixtures.
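
To make the two frequency scales named in the abstract concrete, the following is a minimal sketch of how frequencies in Hz map onto the ERB-rate and Bark scales. The abstract does not say which formulas the authors use, so the choices here are assumptions: the Glasberg and Moore (1990) approximation for ERB-rate and the Zwicker and Terhardt (1980) approximation for Bark. The function names and the helper `erb_spaced_centers` are hypothetical and only illustrate how band centers could be spaced uniformly on an auditory scale rather than uniformly in Hz, as the STFT does.

```python
import math


def hz_to_erb_rate(f_hz: float) -> float:
    """Map frequency in Hz to ERB-rate (assumed: Glasberg & Moore 1990 approximation)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erbs: float) -> float:
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erbs / 21.4) - 1.0) / 0.00437


def hz_to_bark(f_hz: float) -> float:
    """Map frequency in Hz to Bark (assumed: Zwicker & Terhardt 1980 approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def erb_spaced_centers(f_min: float, f_max: float, n_bands: int) -> list:
    """Hypothetical helper: n_bands center frequencies (n_bands >= 2),
    spaced uniformly on the ERB-rate scale between f_min and f_max."""
    lo, hi = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_bands - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_bands)]


if __name__ == "__main__":
    for fc in erb_spaced_centers(50.0, 8000.0, 8):
        print(f"{fc:8.1f} Hz  ERB-rate={hz_to_erb_rate(fc):5.2f}  Bark={hz_to_bark(fc):5.2f}")
```

Centers spaced this way are dense at low frequencies and increasingly sparse at high frequencies, whereas STFT bins are uniformly spaced in Hz. Whether that allocation improves disjointness for a given mixture is the empirical question the paper addresses; the sketch only illustrates the frequency warping.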