A comparative analysis of unsupervised machine-learning methods in PSG-related phenotyping.
Mohammadreza Ghorvei, Tuomas Karhu, Salla Hietakoste, Daniela Ferreira-Santos, Harald Hrubos-Strøm, Anna Sigridur Islind, Luka Biedebach, Sami Nikkonen, Timo Leppänen, Matias Rusanen
Journal of Sleep Research, e14349. Journal Article. Published 2024-10-24. DOI: 10.1111/jsr.14349 (https://doi.org/10.1111/jsr.14349)
Abstract
Obstructive sleep apnea is a heterogeneous sleep disorder with varying phenotypes. Several studies have used cluster analysis to identify obstructive sleep apnea phenotypic clusters. However, the choice of clustering method may affect the results, so it is unclear whether similar clusters can be reproduced with different methods. In this study, we applied four well-known clustering methods (Agglomerative Hierarchical Clustering, K-means, Fuzzy C-means, and Gaussian Mixture Model) to a population of 865 patients with suspected obstructive sleep apnea. By creating five clusters with each method, we examined how the clustering method affects the formation of obstructive sleep apnea clusters and the differences in their physiological characteristics. We used a visualization technique to show the cluster formations, Cohen's kappa statistics to quantify the agreement between clustering methods, and performance evaluation criteria to compare clustering performance. Two of the five clusters were distinctly different with all four methods, while the other three exhibited overlapping features across all methods. Fuzzy C-means and K-means had the strongest agreement (κ = 0.87), and Agglomerative Hierarchical Clustering and Gaussian Mixture Model had the weakest (κ = 0.51). K-means showed the best clustering performance, followed by Fuzzy C-means, in most evaluation criteria. Moreover, Fuzzy C-means showed the greatest potential for handling overlapping clusters compared with the other methods. In conclusion, the choice of clustering method directly affected the formation and physiological characteristics of obstructive sleep apnea clusters. We also highlight the capability of soft clustering methods, particularly Fuzzy C-means, for obstructive sleep apnea phenotyping.
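To illustrate how agreement between two clustering methods can be measured with Cohen's kappa, here is a minimal Python sketch. It is not the authors' code. Cluster labels are arbitrary, so the sketch first relabels one partition to best match the other; this alignment step, the function names `cohen_kappa` and `align_labels`, and the toy label vectors are all assumptions made for illustration, since the abstract does not describe how labels were matched.

```python
from collections import Counter
from itertools import permutations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def align_labels(reference, other):
    """Relabel `other` so that it overlaps `reference` as much as possible.

    Hypothetical helper: brute force over label permutations, which is
    fine for a handful of clusters (5 clusters -> 120 permutations).
    """
    ref_ids = sorted(set(reference))
    other_ids = sorted(set(other))
    if len(other_ids) > len(ref_ids):
        raise ValueError("`other` has more clusters than `reference`")
    # overlap[(o, r)] = number of items labelled o in `other` and r in `reference`.
    overlap = Counter(zip(other, reference))
    best_map, best_score = None, -1
    for perm in permutations(ref_ids, len(other_ids)):
        score = sum(overlap[(o, r)] for o, r in zip(other_ids, perm))
        if score > best_score:
            best_map, best_score = dict(zip(other_ids, perm)), score
    return [best_map[o] for o in other]


# Toy example (made-up labels, not study data): two methods, ten patients.
method_a = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
method_b = [2, 2, 2, 0, 0, 1, 1, 1, 1, 1]

aligned_b = align_labels(method_a, method_b)
kappa = cohen_kappa(method_a, aligned_b)
print(f"kappa after alignment: {kappa:.3f}")
```

Without the alignment step, kappa would mostly reflect the arbitrary numbering of clusters rather than how similarly the two methods group patients. With many clusters, the brute-force search is usually replaced by an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment`.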
About the journal:
The Journal of Sleep Research is dedicated to basic and clinical sleep research. The Journal publishes original research papers and invited reviews in all areas of sleep research (including biological rhythms). The Journal aims to promote the exchange of ideas between basic and clinical sleep researchers coming from a wide range of backgrounds and disciplines. The Journal will achieve this by publishing papers which use multidisciplinary and novel approaches to answer important questions about sleep, as well as its disorders and the treatment thereof.