Speaker Identification Using Data-Driven Score Classification

Image Processing & Communications, Pub Date: 2016-06-01, DOI: 10.1515/ipc-2016-0011

Hock C. Gan, I. Mporas, Saeid Safavi, R. Sotudeh


Citations: 0



Speaker Identification Using Data-Driven Score Classification

Abstract: We present a comparative evaluation of different classification algorithms for a fusion engine used in a speaker identity selection task. The fusion engine combines the scores from a number of classifiers, each of which uses the GMM-UBM approach to match speaker identity. The performance of the evaluated classification algorithms was examined in both the text-dependent and text-independent operation modes. The experimental results indicated a significant improvement in speaker identification accuracy, of approximately 7% and 14.5% for the text-dependent and text-independent scenarios, respectively. Based on these findings, we suggest using fusion with a discriminative algorithm such as a Support Vector Machine in real-world speaker identification applications, where the text-independent scenario predominates.
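The abstract does not give implementation details, so the following is only a minimal sketch of score-level fusion under stated assumptions. Each trial is represented by a vector of scores, one per base classifier, and a discriminative model is trained to separate target trials from impostor trials. To stay dependency-free, the sketch uses logistic regression as a stand-in for the Support Vector Machine mentioned in the abstract. The function names, the toy scores, and the speaker labels are all hypothetical and are not taken from the paper.

```python
import math

def train_fusion(score_vectors, labels, epochs=500, lr=0.1):
    """Fit logistic-regression fusion weights with plain SGD.

    score_vectors: one list of base-classifier scores per trial.
    labels: 1 for a target trial (claimed speaker is correct), 0 otherwise.
    """
    weights = [0.0] * len(score_vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(score_vectors, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log-loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def fused_score(weights, bias, x):
    """Combine the base-classifier scores into a single fused score."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

def identify(weights, bias, candidate_scores):
    """Return the candidate speaker with the highest fused score."""
    return max(
        candidate_scores,
        key=lambda spk: fused_score(weights, bias, candidate_scores[spk]),
    )

# Toy training data (invented): two base classifiers per trial.
# The first score separates the classes well; the second is noisy.
train_x = [
    [2.0, 0.5], [1.5, -0.2], [1.8, 1.0], [2.2, -0.5],     # target trials
    [-1.0, 0.8], [-1.5, 0.3], [-0.5, 1.2], [-2.0, -0.4],  # impostor trials
]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_fusion(train_x, train_y)

# Scores of one test utterance against two hypothetical enrolled speakers.
candidates = {"alice": [1.7, -0.3], "bob": [-0.8, 1.1]}
winner = identify(weights, bias, candidates)
print(winner)
```

In this toy setup the second classifier alone would rank "bob" above "alice", so the example shows how a trained fusion stage can learn to rely on the more informative score rather than averaging the two equally. A real system would train on held-out scores from the actual base classifiers.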

