The CLIPS System for 2022 Spoofing-Aware Speaker Verification Challenge

Interspeech. Pub Date: 2022-09-18. DOI: 10.21437/interspeech.2022-320

Jucai Lin, Tingwei Chen, Jingbiao Huang, Ruidong Fang, Jun Yin, Yuanping Yin, W. Shi, Wei Huang, Yapeng Mao

Citations: 2

Abstract

In this paper, a spoofing-aware speaker verification (SASV) system that integrates an automatic speaker verification (ASV) system and a countermeasure (CM) system is developed. First, a modified re-parameterized VGG (ARepVGG) module is used to extract high-level representations from multi-scale features learned from the raw waveform through sinc-filters, and a spectro-temporal graph attention network then learns the final decision on whether the audio is spoofed. Second, a new network inspired by Max-Feature-Map (MFM) layers is constructed to fine-tune the CM system while keeping the ASV system fixed. The proposed SASV system significantly reduces the SASV equal error rate (SASV-EER) from 6.73% to 1.36% on the evaluation dataset and from 4.85% to 0.98% on the development dataset of the 2022 Spoofing-Aware Speaker Verification Challenge (2022 SASV).
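The abstract reports results as SASV-EER without defining it. The sketch below illustrates one common way to approximate an equal error rate by sweeping a threshold over the observed scores. It assumes that SASV-EER treats bona fide target trials as positives and pools bona fide non-target trials with spoofed trials as negatives. The function names and the toy scores are hypothetical and are not taken from the paper or from the challenge toolkit.

```python
def equal_error_rate(positive_scores, negative_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the mean of the false acceptance and false rejection rates at
    the threshold where the two rates are closest to each other.
    """
    thresholds = sorted(set(positive_scores) | set(negative_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: positive trials scored below the threshold.
        frr = sum(s < t for s in positive_scores) / len(positive_scores)
        # False acceptance: negative trials scored at or above the threshold.
        far = sum(s >= t for s in negative_scores) / len(negative_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def sasv_eer(target_scores, nontarget_scores, spoof_scores):
    """Assumed SASV-EER: non-target and spoofed trials are pooled as negatives."""
    return equal_error_rate(target_scores, list(nontarget_scores) + list(spoof_scores))


# Toy scores for illustration only (not data from the paper).
toy_eer = sasv_eer(
    target_scores=[0.9, 0.8, 0.7, 0.3],
    nontarget_scores=[0.1, 0.2],
    spoof_scores=[0.4, 0.6],
)
print(f"toy SASV-EER: {toy_eer:.2%}")
```

A threshold sweep like this is only an approximation on small trial lists. Evaluation toolkits typically interpolate between operating points on the error curves instead.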
