Exploring and Comparing Unsupervised Clustering Algorithms
M. Lavielle, Philip D. Waggoner
Journal of Open Research Software, published 2020-10-07
DOI: https://doi.org/10.5334/jors.269
Clustering is one of the most widely used approaches for exploring and understanding non-random structure in data in a largely assumption-free manner. In this paper, we detail two original Shiny apps, written in R, openly developed on GitHub, and archived at Zenodo, for exploring and comparing two major unsupervised clustering algorithms: k-means and Gaussian mixture models fit via Expectation-Maximization. The first app uses simulated data; the second uses Fisher’s Iris data set, which is familiar to many applied researchers. Both apps allow the algorithms to be compared visually and numerically. Beyond their value as tools for comparing these clustering techniques, the apps' open-source architecture invites engagement and extension by the broader open science community, for example by adding other data sets and algorithms.
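To make the comparison concrete, the following is a minimal sketch of the two algorithms on simulated one-dimensional data. It is written in plain Python for illustration and is not the apps' R code; the function names, the simulated means (0 and 4), the unit standard deviations, and the initialization at the data extremes are all assumptions chosen for this example. It shows the basic contrast: k-means assigns each point to exactly one cluster, while the Gaussian mixture fit by Expectation-Maximization returns a probability of membership in each component.

```python
import math
import random


def simulate(n_per_cluster=100, seed=0):
    """Draw 1-D points from two Gaussians (assumed means 0 and 4, sd 1)."""
    rng = random.Random(seed)
    low = [rng.gauss(0.0, 1.0) for _ in range(n_per_cluster)]
    high = [rng.gauss(4.0, 1.0) for _ in range(n_per_cluster)]
    return low + high


def kmeans_1d(xs, centers, n_iter=50):
    """Lloyd's algorithm: hard-assign points to the nearest center, then re-average."""
    centers = list(centers)
    labels = [0] * len(xs)
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
                  for x in xs]
        for j in range(len(centers)):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = sum(members) / len(members)
    return centers, labels


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gmm_em_1d(xs, means, n_iter=100):
    """EM for a 1-D Gaussian mixture: soft responsibilities instead of hard labels."""
    k = len(means)
    means = list(means)
    sigmas = [1.0] * k          # assumed starting standard deviations
    weights = [1.0 / k] * k     # equal starting mixing proportions
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point
        resp = []
        for x in xs:
            dens = [weights[j] * normal_pdf(x, means[j], sigmas[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted updates of the parameters
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-6)  # guard against collapse
    return means, sigmas, weights, resp


xs = simulate()
init = [min(xs), max(xs)]  # simple deterministic initialization (an assumption)
km_centers, km_labels = kmeans_1d(xs, init)
gm_means, gm_sigmas, gm_weights, gm_resp = gmm_em_1d(xs, init)

# Points the mixture model is unsure about: no component has probability >= 0.9
ambiguous = sum(1 for r in gm_resp if max(r) < 0.9)

print("k-means centers:", [round(c, 2) for c in km_centers])
print("GMM means:", [round(m, 2) for m in gm_means])
print("GMM standard deviations:", [round(s, 2) for s in gm_sigmas])
print("points with no component above 0.9:", ambiguous)
```

The count of ambiguous points is one simple numerical summary of where the two methods differ: k-means still gives those points a single label, whereas the mixture model reports its uncertainty about them.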