Data Visualization with Probabilistic Clustering and Neighbor Embedding
Xiaohui Liao, Jingqi Yan
2018 37th Chinese Control Conference (CCC), 2018-07-01
DOI: https://doi.org/10.23919/CHICC.2018.8482651
Citations: 0
Abstract
In the era of information explosion, processing and analyzing large-scale, high-dimensional data sets has become a major challenge for data mining and machine learning. To obtain and intuitively understand the information underlying big data, an effective visualization technique is in demand. Many successful visualization techniques project high-dimensional data sets into low-dimensional spaces so that data points can be presented in scatter plots, histograms, or parallel coordinate plots. In this paper, we propose a new algorithm called PCNE. The algorithm first applies probabilistic clustering to obtain a coarse classification of the data set, and then reconstructs the joint probability using heuristic information from the classification results and the neighborhood relationships. Our experimental results on public data sets demonstrate that PCNE outperforms classical embedding algorithms in revealing both the local and global structure of the data.
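The two-stage idea described in the abstract can be illustrated with a rough sketch. The abstract does not say which clustering method PCNE uses or how cluster information and neighborhood information are combined, so everything below is an assumption made for illustration: soft k-means stands in for the probabilistic clustering step, a fixed-bandwidth Gaussian kernel stands in for the neighborhood relationship, and the function names (`soft_cluster`, `neighbor_affinity`, `joint_probability`) and the blending weight `alpha` are hypothetical rather than taken from the paper.

```python
import numpy as np


def soft_cluster(X, k, n_iter=20, temperature=1.0, seed=0):
    """Soft k-means returning an (n, k) matrix of membership probabilities.

    Assumption: a stand-in for the paper's probabilistic clustering step.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Softmax over negative distances, shifted for numerical stability.
        logits = -(d2 - d2.min(axis=1, keepdims=True)) / temperature
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        # Move each center to the responsibility-weighted mean of the points.
        weights = resp.sum(axis=0) + 1e-12
        centers = (resp.T @ X) / weights[:, None]
    return resp


def neighbor_affinity(X, sigma=1.0):
    """Gaussian-kernel affinities between all pairs, with a zero diagonal."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A


def joint_probability(X, k, alpha=0.5, sigma=1.0):
    """Blend cluster co-membership with neighbor affinity into a joint P.

    Assumption: the convex blend with weight `alpha` is hypothetical; the
    abstract does not specify the actual combination rule.
    """
    resp = soft_cluster(X, k)
    # Probability that points i and j fall in the same cluster.
    same_cluster = resp @ resp.T
    np.fill_diagonal(same_cluster, 0.0)
    A = neighbor_affinity(X, sigma)
    P = (1.0 - alpha) * A / A.sum() + alpha * same_cluster / same_cluster.sum()
    P = (P + P.T) / 2.0  # enforce symmetry
    return P / P.sum()   # normalize to a joint distribution over pairs


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two well-separated synthetic blobs in 2-D.
    X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
                   rng.normal(10.0, 0.5, size=(20, 2))])
    P = joint_probability(X, k=2)
    print(P.shape, P.sum())
```

In a neighbor-embedding method of this family, a matrix such as `P` would typically serve as the high-dimensional target distribution that the low-dimensional layout is optimized to match; that optimization step is omitted here.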