ContaminatedMixt: An R Package for Fitting Parsimonious Mixtures of Multivariate Contaminated Normal Distributions
A. Punzo, A. Mazza, P. McNicholas
arXiv: Computation, 2016-06-12. DOI: 10.18637/JSS.V085.I10 (https://doi.org/10.18637/JSS.V085.I10)
We introduce the R package ContaminatedMixt, designed to promote the use of mixtures of multivariate contaminated normal distributions as a tool for robust clustering and classification under the common assumption of elliptically contoured groups. Thirteen variants of the model are also implemented to introduce parsimony. The expectation-conditional maximization algorithm is used to obtain maximum likelihood parameter estimates, and likelihood-based model selection criteria are used to select the model and the number of groups. When several models have to be fitted, parallel computation can be used on multicore PCs and computer clusters. Unlike the more popular mixtures of multivariate normal and t distributions, this approach also allows automatic detection of mild outliers via the maximum a posteriori probabilities procedure. Applications to artificial and real data illustrate the use of the package.
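The outlier-detection idea mentioned in the abstract can be illustrated with a small sketch. It is written in plain Python rather than R and does not use or reproduce the ContaminatedMixt API. It assumes the usual form of a contaminated normal density for one group, alpha * N(mu, Sigma) + (1 - alpha) * N(mu, eta * Sigma) with eta > 1, and flags a point as an outlier when its posterior probability of coming from the "good" component falls below 0.5. The function names, the 2-D restriction, and the parameter values (alpha = 0.9, eta = 10) are hypothetical choices made for illustration; in practice the parameters would be estimated rather than fixed.

```python
import math


def mahalanobis_sq_2d(x, mu, sigma):
    """Squared Mahalanobis distance of a 2-D point x from mean mu,
    for a 2x2 covariance matrix sigma given as ((a, b), (c, d))."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def prob_good(delta_sq, dim, alpha, eta):
    """Posterior probability that a point with squared Mahalanobis
    distance delta_sq belongs to the 'good' (non-inflated) component.
    The common factor (2*pi)^(-dim/2) * |Sigma|^(-1/2) cancels out."""
    good = alpha * math.exp(-0.5 * delta_sq)
    bad = (1.0 - alpha) * eta ** (-dim / 2.0) * math.exp(-0.5 * delta_sq / eta)
    return good / (good + bad)


def is_outlier(x, mu, sigma, alpha=0.9, eta=10.0):
    """Assumed rule for this sketch: flag x when the 'good' posterior < 0.5."""
    delta_sq = mahalanobis_sq_2d(x, mu, sigma)
    return prob_good(delta_sq, 2, alpha, eta) < 0.5


if __name__ == "__main__":
    mu = (0.0, 0.0)
    sigma = ((1.0, 0.0), (0.0, 1.0))
    for point in [(0.0, 0.0), (1.0, -1.0), (5.0, 5.0)]:
        print(point, "outlier" if is_outlier(point, mu, sigma) else "typical")
```

Because the posterior depends on the point only through its Mahalanobis distance, points far from the group mean relative to Sigma are the ones flagged. In a mixture, a rule of this kind would be applied within the group to which an observation has been assigned.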