NGPCA: Clustering of high-dimensional and non-stationary data streams
Nico Migenda, Ralf Möller, Wolfram Schenck
Software Impacts, volume 20, Article 100635. Published 2024-03-29.
Journal impact factor: 1.3 (JCR Q3, Computer Science, Software Engineering)
DOI: 10.1016/j.simpa.2024.100635
URL: https://www.sciencedirect.com/science/article/pii/S266596382400023X
Citations: 0
Abstract
Neural Gas Principal Component Analysis (NGPCA) is an online clustering algorithm. An NGPCA model is a mixture of local PCA units and combines dimensionality reduction with vector quantization. Recently, NGPCA has been extended with an adaptive learning rate and an adaptive potential function for accurate and efficient clustering of high-dimensional and non-stationary data streams. The algorithm achieved highly competitive results on clustering benchmark datasets compared to the state of the art. Our implementation of the algorithm was developed in MATLAB and is available as open source. This code can be easily applied to the clustering of stationary and non-stationary data.
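To make the idea of "a mixture of local PCA units" that combines vector quantization with dimensionality reduction more concrete, the following is a minimal Python sketch of one online update step. It is not the authors' MATLAB implementation and it omits the adaptive learning rate and adaptive potential function described above. Several simplifications are assumptions made for illustration only: units are ranked by plain Euclidean distance, each unit learns a single principal direction with Oja's rule, and the learning rate `eps` and neighborhood range `lam` are fixed constants. The names `ng_local_pca_step` and `run_demo` are hypothetical.

```python
import math
import random


def ng_local_pca_step(centers, axes, x, eps=0.05, lam=0.1):
    """One online update for sample x; returns the index of the winning unit.

    centers and axes are lists of lists and are modified in place.
    """
    # Rank units by squared Euclidean distance to the sample.
    # Assumption: the published algorithm ranks with a PCA-based potential.
    dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    order = sorted(range(len(centers)), key=lambda k: dists[k])
    for rank, k in enumerate(order):
        # Neural-gas style step size: decays exponentially with the rank.
        alpha = eps * math.exp(-rank / lam)
        c, w = centers[k], axes[k]
        # Vector quantization: move the unit center toward the sample.
        for i in range(len(x)):
            c[i] += alpha * (x[i] - c[i])
        # Local PCA: Oja's rule on the sample relative to the unit center.
        d = [xi - ci for xi, ci in zip(x, c)]
        y = sum(di * wi for di, wi in zip(d, w))
        for i in range(len(x)):
            w[i] += alpha * y * (d[i] - y * w[i])
        # Renormalize so the axis stays a unit vector.
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        for i in range(len(x)):
            w[i] /= norm
    return order[0]


def run_demo(seed=0, n=2000):
    """Stream samples from two elongated 2-D clusters through two units."""
    rng = random.Random(seed)
    centers = [[1.0, 1.0], [9.0, 9.0]]
    s = math.sqrt(0.5)
    axes = [[s, s], [s, s]]  # both axes start on the diagonal
    for _ in range(n):
        if rng.random() < 0.5:
            # Cluster near (0, 0), elongated along the first coordinate.
            x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.1)]
        else:
            # Cluster near (10, 10), elongated along the second coordinate.
            x = [rng.gauss(10.0, 0.1), rng.gauss(10.0, 1.0)]
        ng_local_pca_step(centers, axes, x)
    return centers, axes


if __name__ == "__main__":
    final_centers, final_axes = run_demo()
    print("centers:", final_centers)
    print("axes:", final_axes)
```

In this toy setup each unit should settle near one cluster, with its axis roughly aligned to that cluster's direction of largest variance (the sign of an axis is arbitrary). Because the step size stays constant instead of decaying, the units keep following the stream if the clusters drift, which is the motivation for online methods on non-stationary data.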