{"title":"Multiplicative Mixture Models for Overlapping Clustering","authors":"Qiang Fu, A. Banerjee","doi":"10.1109/ICDM.2008.103","DOIUrl":null,"url":null,"abstract":"The problem of overlapping clustering, where a point is allowed to belong to multiple clusters, is becoming increasingly important in a variety of applications. In this paper, we present an overlapping clustering algorithm based on multiplicative mixture models. We analyze a general setting where each component of the multiplicative mixture is from an exponential family, and present an efficient alternating maximization algorithm to learn the model and infer overlapping clusters. We also show that when each component is assumed to be a Gaussian, we can apply the kernel trick leading to non-linear cluster separators and obtain better clustering quality. The efficacy of the proposed algorithms is demonstrated using experiments on both UCI benchmark datasets and a microarray gene expression dataset.","PeriodicalId":252958,"journal":{"name":"2008 Eighth IEEE International Conference on Data Mining","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"49","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Eighth IEEE International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2008.103","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 49
Abstract
The problem of overlapping clustering, where a point is allowed to belong to multiple clusters, is becoming increasingly important in a variety of applications. In this paper, we present an overlapping clustering algorithm based on multiplicative mixture models. We analyze a general setting where each component of the multiplicative mixture is from an exponential family, and present an efficient alternating maximization algorithm to learn the model and infer overlapping clusters. We also show that when each component is assumed to be a Gaussian, we can apply the kernel trick leading to non-linear cluster separators and obtain better clustering quality. The efficacy of the proposed algorithms is demonstrated using experiments on both UCI benchmark datasets and a microarray gene expression dataset.