Robust speaker clustering quality estimation
Yishai Cohen, I. Lapidot
2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE), 2018-12-01
DOI: 10.1109/ICSEE.2018.8646164 (https://doi.org/10.1109/ICSEE.2018.8646164)
Citation count: 1
Abstract
This paper focuses on estimating the quality of a clustering process. In our case, the task is to cluster short speech segments that belong to different speakers. Moreover, speaker clustering quality can be estimated well across several clustering approaches if they are all based on the same features. This is important because it allows us to use the same quality estimation system without retraining and still achieve reasonable results when the clustering method is changed. We predict the system’s quality by applying a logistic regression estimator to several statistical parameters of the clustering. In this paper, mean-shift clustering with either cosine or probabilistic linear discriminant analysis (PLDA) scores as the similarity measure, and stochastic vector quantization (VQ) with cosine distance, were applied to cluster the short speaker segments, which are represented by i-vectors. The quality of the clustering is measured using the average cluster purity (ACP), the average speaker purity (ASP), and K. We show that these measures can be estimated fairly well by applying logistic regression to various clustering statistics that are calculated once clustering is complete. These statistical parameters are used as a feature vector representing the clustering.
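The abstract names ACP, ASP and K without defining them. The sketch below shows how these three measures are commonly computed in the speaker clustering literature, assuming the usual definitions: ACP and ASP are count-weighted purities of the clusters and of the speakers, and K is their geometric mean. The paper may define them slightly differently, and the function name `purity_scores` and its arguments are illustrative rather than taken from the paper.

```python
from collections import Counter
from math import sqrt


def purity_scores(cluster_ids, speaker_ids):
    """Return (ACP, ASP, K) for a hard clustering of speech segments.

    Assumed definitions (not confirmed by the abstract):
      ACP = (1/N) * sum over clusters i, speakers j of n_ij^2 / n_i.
      ASP = (1/N) * sum over clusters i, speakers j of n_ij^2 / n_.j
      K   = sqrt(ACP * ASP)
    where n_ij counts the segments of speaker j placed in cluster i.
    """
    if len(cluster_ids) != len(speaker_ids) or not cluster_ids:
        raise ValueError("need two non-empty label lists of equal length")
    n_total = len(cluster_ids)
    # joint[(c, s)] = number of segments in cluster c spoken by speaker s.
    joint = Counter(zip(cluster_ids, speaker_ids))
    per_cluster = Counter(cluster_ids)  # n_i. for each cluster
    per_speaker = Counter(speaker_ids)  # n_.j for each speaker
    acp = sum(n * n / per_cluster[c] for (c, _), n in joint.items()) / n_total
    asp = sum(n * n / per_speaker[s] for (_, s), n in joint.items()) / n_total
    return acp, asp, sqrt(acp * asp)
```

For example, `purity_scores([0, 0, 0, 0], ["a", "a", "b", "b"])` merges two speakers into one cluster. Under the assumed definitions the ACP is 0.5, because the single cluster is evenly mixed, while the ASP is 1.0, because no speaker is split across clusters. K is then about 0.707. Computing these scores requires the true speaker labels, which is why the paper estimates them from clustering statistics instead.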