Computational efficient Variational Bayesian Gaussian Mixture Models via Coreset
Min Zhang, Yinlin Fu, K. Bennett, Teresa Wu
2016 International Conference on Computer, Information and Telecommunication Systems (CITS)
Published: 2016-07-06
DOI: 10.1109/CITS.2016.7546405 (https://doi.org/10.1109/CITS.2016.7546405)
Citations: 3
Abstract
The Variational Bayesian Gaussian Mixture Model (VBGMM) is a popular clustering algorithm with reliable performance. However, model fitting takes a long time, especially on large-scale data, because it uses the whole dataset. To address this issue, in this paper we propose a new algorithm termed weighted VBGMM via Coreset. Specifically, a new coreset construction method is first proposed to sample the data, and the sampled data are then used to fit the model. To evaluate the algorithm, two groups of datasets are used: 1) six rat kidney image datasets and 2) three human kidney image datasets. The results show that the proposed algorithm is much faster (about 20 times) than classic VBGMM while maintaining similar performance on the whole dataset.
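The abstract does not specify how the coreset is built, so the following is only a minimal sketch of the general idea of replacing a large dataset with a small weighted sample. It assumes a generic importance-sampling scheme (half uniform, half proportional to squared distance from the data mean, often called a lightweight coreset), which is not necessarily the construction proposed in the paper. The function name `lightweight_coreset` and its parameters are hypothetical.

```python
import numpy as np


def lightweight_coreset(X, m, rng=None):
    """Draw m weighted points from X by importance sampling.

    Assumption: a generic 'lightweight coreset' proposal is used here,
    not the coreset construction proposed in the paper.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # Squared distance of every point to the overall data mean.
    sq_dist = np.sum((X - X.mean(axis=0)) ** 2, axis=1)
    total = sq_dist.sum()

    # Sampling distribution: half uniform, half distance-proportional.
    if total == 0:
        q = np.full(n, 1.0 / n)
    else:
        q = 0.5 / n + 0.5 * sq_dist / total

    # Sample with replacement and reweight so that weighted sums over
    # the coreset are unbiased estimates of sums over the full data.
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])
    return X[idx], weights


if __name__ == "__main__":
    data_rng = np.random.default_rng(0)
    X = data_rng.normal(size=(5000, 2))
    C, w = lightweight_coreset(X, 500, rng=1)

    # Weighted sufficient statistics of the kind a weighted mixture
    # fit would consume in place of full-data sums.
    weighted_mean = np.average(C, axis=0, weights=w)
    print(C.shape, w.sum(), weighted_mean)
```

A weighted mixture fit would then use `w` to scale each sampled point's contribution to the model's sufficient statistics. That fitting step is left out here because the abstract gives no details of the weighted VBGMM updates.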