Chengqian Xian, Camila P. E. de Souza, John Jewell, Ronaldo Dias
{"title":"Clustering functional data via variational inference","authors":"Chengqian Xian, Camila P. E. de Souza, John Jewell, Ronaldo Dias","doi":"10.1007/s11634-024-00590-w","DOIUrl":null,"url":null,"abstract":"<p>Among different functional data analyses, clustering analysis aims to determine underlying groups of curves in the dataset when there is no information on the group membership of each curve. In this work, we develop a novel variational Bayes (VB) algorithm for clustering and smoothing functional data simultaneously via a B-spline regression mixture model with random intercepts. We employ the deviance information criterion to select the best number of clusters. The proposed VB algorithm is evaluated and compared with other methods (<i>k</i>-means, functional <i>k</i>-means and two other model-based methods) via a simulation study under various scenarios. We apply our proposed methodology to two publicly available datasets. We demonstrate that the proposed VB algorithm achieves satisfactory clustering performance in both simulation and real data analyses.\n</p>","PeriodicalId":49270,"journal":{"name":"Advances in Data Analysis and Classification","volume":"50 1","pages":""},"PeriodicalIF":1.4000,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in Data Analysis and Classification","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11634-024-00590-w","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 0
Abstract
Among different functional data analyses, clustering analysis aims to determine underlying groups of curves in the dataset when there is no information on the group membership of each curve. In this work, we develop a novel variational Bayes (VB) algorithm for clustering and smoothing functional data simultaneously via a B-spline regression mixture model with random intercepts. We employ the deviance information criterion to select the best number of clusters. The proposed VB algorithm is evaluated and compared with other methods (k-means, functional k-means and two other model-based methods) via a simulation study under various scenarios. We apply our proposed methodology to two publicly available datasets. We demonstrate that the proposed VB algorithm achieves satisfactory clustering performance in both simulation and real data analyses.
期刊介绍:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data, and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline, and illuminate the basic ideas and techniques of special approaches.