SV - VLSP 2021: Combine Attentive Statistical Pooling-based Xvector and Pretrained ECAPA-TDNN for Vietnamese Text-Independent Speaker Verification

VNU Journal of Science: Computer Science and Communication Engineering. Pub Date: 2022-06-30. DOI: 10.25073/2588-1086/vnucsce.320

T. Thang, Huynh Thi Thanh Binh

Citations: 1

Abstract

Recently, Xvectors and ECAPA-TDNN have been considered state-of-the-art models for speaker verification systems. This paper proposes a novel approach that combines an attentive statistical pooling-based Xvector with a pre-trained ECAPA-TDNN for Vietnamese speaker verification. Experiments are conducted on several recent Vietnamese speech datasets. The results show that the proposed combination outperforms all of its constituent models, with a relative equal error rate (EER) improvement of 4% to 37%, and ranked second in Task 2 of the 2021 VLSP Speaker Verification competition.
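For readers unfamiliar with the metric, the following is a minimal sketch of how an EER and a relative EER improvement are typically computed. It is not the authors' evaluation code: the function names `compute_eer` and `relative_improvement` are hypothetical, the threshold sweep is a simple approximation, and the numbers in the usage lines are illustrative values, not results from the paper.

```python
import numpy as np

def compute_eer(scores, labels):
    """Approximate the equal error rate from trial scores.

    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    target = scores[labels == 1]
    nontarget = scores[labels == 0]
    thresholds = np.unique(scores)  # sorted candidate thresholds
    # False-accept rate: non-target trials that are accepted.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # False-reject rate: target trials that are rejected.
    frr = np.array([(target < t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return float((far[idx] + frr[idx]) / 2)

def relative_improvement(eer_baseline, eer_new):
    """Fraction by which the new EER is lower than the baseline EER."""
    return (eer_baseline - eer_new) / eer_baseline

# Illustrative values only (assumed, not taken from the paper).
eer = compute_eer([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
gain = relative_improvement(0.10, 0.08)
```

Under this definition, a relative improvement of 0.2 means the combined system's EER is 20% lower than that of the baseline it is compared against, which is a different quantity from an absolute reduction in percentage points.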
