A Deeper Look at Gaussian Mixture Model Based Anti-Spoofing Systems
Bhusan Chettri, Bob L. Sturm
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5159-5163. Published 2018-04-22.
DOI: 10.1109/ICASSP.2018.8461467 (https://doi.org/10.1109/ICASSP.2018.8461467)
Citations: 16
Abstract
A “replay attack” involves replaying pre-recorded speech of an enrolled speaker to bypass an automatic speaker verification system. The 2017 ASVspoof Challenge focused on this kind of attack. In this paper, we describe our evaluation work after this challenge. First, we study the effectiveness of Gaussian Mixture Model (GMM) systems using six different hand-crafted features for detecting replay attacks. Second, we take a deeper look at these GMM systems and perform a frame-level analysis of log likelihoods. Our analysis shows how system performance can depend on a simple class-dependent cue in the dataset: initial silence frames consisting of zeros appear in the genuine signals but are missing from the spoofed versions. Third, we show how we can fool these systems using this cue. For example, we find that the equal error rate (EER) of one GMM system rises dramatically from 14.82 to 44.44 when we add the cue to the evaluation data. Finally, we explore whether this problem can be mitigated by pre-processing the 2017 ASVspoof Challenge dataset.
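To make the two quantities in the abstract concrete, here is a minimal Python sketch. It is not the authors' code. The function names (`leading_zero_count`, `add_zero_cue`, `equal_error_rate`) are hypothetical, and so are the assumptions that a higher score means "more likely genuine" and that the cue is a run of exactly-zero samples at the start of a signal. The EER here is returned as a fraction and found by a simple threshold sweep without interpolation, which may differ from the challenge's official scoring tool.

```python
from typing import List, Sequence


def leading_zero_count(signal: Sequence[float]) -> int:
    """Count exactly-zero samples before the first non-zero sample."""
    count = 0
    for sample in signal:
        if sample != 0:
            break
        count += 1
    return count


def add_zero_cue(signal: Sequence[float], n_zeros: int) -> List[float]:
    """Prepend n_zeros zero-valued samples (the assumed form of the cue)."""
    return [0.0] * n_zeros + list(signal)


def equal_error_rate(genuine_scores: Sequence[float],
                     spoof_scores: Sequence[float]) -> float:
    """Approximate EER as a fraction in [0, 1].

    Sweeps every observed score as a decision threshold and returns the
    mean of the false acceptance and false rejection rates at the
    threshold where the two are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Spoofed trials accepted as genuine (score at or above threshold).
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # Genuine trials rejected (score below threshold).
        frr = sum(g < t for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

For instance, calling `add_zero_cue` on a spoofed signal before scoring it, then recomputing `equal_error_rate`, would be one way to probe whether a detector relies on the leading-zero cue. The sketch does not model the GMM scoring itself.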