Ivec-PLDA-AHC priors for VB-HMM speaker diarization system

Liang He, Xianhong Chen, Can Xu, Tianyu Liang, Jia Liu
Published in: 2017 IEEE International Workshop on Signal Processing Systems (SiPS)
Publication date: 2017-10-01
DOI: 10.1109/SiPS.2017.8109998 (https://doi.org/10.1109/SiPS.2017.8109998)
Citations: 2

Abstract

This paper proposes a hybrid speaker diarization system. Its main body is a variational Bayes hidden Markov model (VB-HMM) speaker diarization system. The VB-HMM system avoids making premature hard decisions and exploits soft speaker information iteratively, so it outperforms most mainstream speaker diarization systems. Unfortunately, it is sensitive to its prior in some cases: either a uniform prior or a flat Dirichlet prior may fail and lead to poor results, so a more robust and informative prior is desirable. The second branch is an i-vector, probabilistic linear discriminant analysis, agglomerative hierarchical clustering (Ivec-PLDA-AHC) system. Benefiting from the excellent performance of the Ivec-PLDA system in speaker recognition, the Ivec-PLDA-AHC diarization system is believed to be better at clustering segmental i-vectors by speaker. Inspired by this, we take the output of the Ivec-PLDA-AHC system as the VB-HMM's prior. Experiments on our collected database show that the proposed system performs significantly better than either the VB-HMM or the Ivec-PLDA-AHC system alone.
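The abstract does not say how the Ivec-PLDA-AHC output is turned into a prior, so the following is only a minimal sketch of one plausible reading: hard per-segment cluster labels are softened into per-segment speaker probabilities that could initialize the VB-HMM speaker posteriors, in place of a flat Dirichlet draw. The function names, the `confidence` parameter, and the example labels are all hypothetical and are not taken from the paper.

```python
import numpy as np


def ahc_labels_to_prior(labels, num_speakers, confidence=0.9):
    """Soften hard AHC cluster labels into per-segment speaker priors.

    labels: one integer cluster index per segment (assumed AHC output).
    confidence: probability mass placed on the AHC-assigned speaker
        (a hypothetical knob, not a value from the paper).
    Returns an array of shape (num_segments, num_speakers); rows sum to 1.
    """
    labels = np.asarray(labels)
    if num_speakers < 2:
        # With a single speaker there is nothing to soften.
        return np.ones((labels.size, 1))
    # Spread the remaining mass evenly over the other speakers.
    off_mass = (1.0 - confidence) / (num_speakers - 1)
    prior = np.full((labels.size, num_speakers), off_mass)
    prior[np.arange(labels.size), labels] = confidence
    return prior


def flat_dirichlet_prior(num_segments, num_speakers, seed=0):
    """Baseline: one draw per segment from a flat Dirichlet (all alphas = 1)."""
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.ones(num_speakers), size=num_segments)


# Hypothetical AHC output for six segments and three speakers.
labels = [0, 0, 1, 1, 0, 2]
informative = ahc_labels_to_prior(labels, num_speakers=3)
uninformative = flat_dirichlet_prior(len(labels), num_speakers=3)
```

Under this assumption, `informative` keeps the clustering decision as the most likely speaker for each segment while leaving some mass for the VB-HMM iterations to revise it, whereas `uninformative` carries no speaker structure at all.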