Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo
{"title":"基于模型的重尾分布新简约混合聚类","authors":"Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo","doi":"10.1007/s10182-021-00430-8","DOIUrl":null,"url":null,"abstract":"<div><p>Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions-the shifted exponential normal and the tail-inflated normal-recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as by-product. A further simulated analysis is conducted to assess how sensitive our and some well-established parsimonious competitors are to their own generative scheme. Lastly, our and the competing models are evaluated in terms of fitting and clustering on three real datasets.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":"106 2","pages":"315 - 347"},"PeriodicalIF":1.4000,"publicationDate":"2022-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Model-based clustering via new parsimonious mixtures of heavy-tailed distributions\",\"authors\":\"Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo\",\"doi\":\"10.1007/s10182-021-00430-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions-the shifted exponential normal and the tail-inflated normal-recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as by-product. A further simulated analysis is conducted to assess how sensitive our and some well-established parsimonious competitors are to their own generative scheme. Lastly, our and the competing models are evaluated in terms of fitting and clustering on three real datasets.</p></div>\",\"PeriodicalId\":55446,\"journal\":{\"name\":\"Asta-Advances in Statistical Analysis\",\"volume\":\"106 2\",\"pages\":\"315 - 347\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2022-01-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Asta-Advances in Statistical Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10182-021-00430-8\",\"RegionNum\":4,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Asta-Advances in Statistical Analysis","FirstCategoryId":"100","ListUrlMain":"https://link.springer.com/article/10.1007/s10182-021-00430-8","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Model-based clustering via new parsimonious mixtures of heavy-tailed distributions
Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions-the shifted exponential normal and the tail-inflated normal-recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as by-product. A further simulated analysis is conducted to assess how sensitive our and some well-established parsimonious competitors are to their own generative scheme. Lastly, our and the competing models are evaluated in terms of fitting and clustering on three real datasets.
期刊介绍:
AStA - Advances in Statistical Analysis, a journal of the German Statistical Society, is published quarterly and presents original contributions on statistical methods and applications and review articles.