SpecAugment Impact on Automatic Speaker Verification System

2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI). Pub Date: 2019-12-01. DOI: 10.1109/ISRITI48646.2019.9034603

M. Faisal, S. Suyanto

Citations: 16

Abstract

Automatic speaker verification (ASV) is one of the challenging problems in speech processing, since many machine learning models are capable of synthesizing fake speech from a given text. This paper discusses the impact of SpecAugment on methods such as Gaussian Mixture Models (GMMs) and Deep Neural Networks (DNNs). Experiments on a speech dataset sampled from ASVspoof 2019, which was created specifically to tackle the threat of spoofing, show that the DNN system produces an Equal Error Rate (EER) of 18.1%, which is better than the GMM system's EER of 19.0%. After combination with a traditional augmentation technique, the DNN system again gives a better EER (15.3%) than the GMM system (15.7%).
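For readers unfamiliar with SpecAugment, the masking part of the technique can be sketched as below. This is a minimal illustration, not the configuration used in the paper: the function name `spec_augment`, the mask widths, the number of masks, and the fill value are all assumptions, and the time-warping step of the original SpecAugment method is omitted. The spectrogram is represented as a plain list of rows, `spec[freq_bin][frame]`.

```python
import random


def spec_augment(spec, max_freq_width=8, max_time_width=20,
                 n_freq_masks=1, n_time_masks=1, fill=0.0, rng=None):
    """Return a masked copy of `spec`, indexed as spec[freq_bin][frame].

    All default parameter values are illustrative assumptions.
    """
    rng = rng or random.Random()
    n_freq = len(spec)
    n_time = len(spec[0]) if n_freq else 0
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency masking: overwrite f consecutive frequency bins.
    for _ in range(n_freq_masks):
        f = rng.randint(0, min(max_freq_width, n_freq))
        f0 = rng.randint(0, n_freq - f)
        for i in range(f0, f0 + f):
            out[i] = [fill] * n_time

    # Time masking: overwrite t consecutive frames in every bin.
    for _ in range(n_time_masks):
        t = rng.randint(0, min(max_time_width, n_time))
        t0 = rng.randint(0, n_time - t)
        for row in out:
            row[t0:t0 + t] = [fill] * t

    return out


# Toy example: a 40-bin x 100-frame "spectrogram" filled with ones.
mel = [[1.0] * 100 for _ in range(40)]
augmented = spec_augment(mel, rng=random.Random(0))
```

Each call draws new mask positions, so applying the function once per training epoch yields a different masked view of the same utterance each time.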



