Telephone handset identification using sparse representations of spectral feature sketches
Constantine Kotropoulos
2013 International Workshop on Biometrics and Forensics (IWBF), 2013-04-04
DOI: 10.1109/IWBF.2013.6547326 (https://doi.org/10.1109/IWBF.2013.6547326)
Citations: 21
Abstract
Speech signals convey useful information about the recording devices used to capture them. Here, acquisition device identification is studied using sketches of spectral features (SSFs) as intrinsic fingerprints. The SSFs are extracted from the speech signal by first averaging its spectrogram along the time axis and then mapping the resulting mean spectrogram into a low-dimensional space, such that the “distance properties” of the high-dimensional mean spectrograms are preserved. Such a mapping is obtained by taking the inner product of the mean spectrogram with a vector of independent, identically distributed random variables drawn from a p-stable distribution. By applying a sparse-representation-based classifier to the SSFs, state-of-the-art identification accuracy exceeding 95% has been measured on a set of 8 telephone handsets from the Lincoln-Labs Handset Database (LLHDB).
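The sketching step described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the frame length, hop size, Hann window, sketch dimension `k`, and all function names are assumptions chosen for illustration, and the two signals are synthetic stand-ins rather than LLHDB recordings. It also assumes that one inner product is taken per sketch coordinate, so `k` random vectors are stacked into a matrix. Gaussian entries (2-stable) are used to preserve Euclidean distances and Cauchy entries (1-stable) to preserve L1 distances.

```python
import numpy as np


def mean_spectrum(signal, frame_len=256, hop=128):
    """Average the magnitude spectrogram of `signal` along the time axis."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrogram = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return spectrogram.mean(axis=0)


def p_stable_matrix(k, d, p, rng):
    """Return a k x d matrix with i.i.d. entries from a p-stable distribution."""
    if p == 2:
        return rng.standard_normal((k, d))   # Gaussian is 2-stable
    if p == 1:
        return rng.standard_cauchy((k, d))   # Cauchy is 1-stable
    raise ValueError("this sketch only covers p = 1 and p = 2")


def sketch(x, R):
    """Each sketch coordinate is the inner product of x with one row of R."""
    return R @ x


def estimate_distance(s1, s2, p):
    """Estimate the L_p distance between the original vectors from their sketches."""
    diff = s1 - s2
    if p == 2:
        return np.sqrt(np.mean(diff ** 2))   # entries are N(0, ||x - y||_2^2)
    if p == 1:
        return np.median(np.abs(diff))       # entries are Cauchy with scale ||x - y||_1
    raise ValueError("this sketch only covers p = 1 and p = 2")


# Synthetic stand-ins for two recordings (white noise and a smoothed copy).
rng = np.random.default_rng(0)
noise = rng.standard_normal(8000)
smoothed = np.convolve(noise, np.ones(8) / 8, mode="same")

x, y = mean_spectrum(noise), mean_spectrum(smoothed)
R = p_stable_matrix(32, x.size, p=2, rng=rng)       # 129 dimensions down to 32
print("true L2 distance:     ", np.linalg.norm(x - y))
print("estimate from sketch: ", estimate_distance(sketch(x, R), sketch(y, R), p=2))
```

The estimate tightens as `k` grows. The sparse-representation-based classifier would then operate on the sketches instead of the full mean spectrograms.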