Mohammadreza Ghorvei, Tuomas Karhu, Salla Hietakoste, Daniela Ferreira-Santos, Harald Hrubos-Strøm, Anna Sigridur Islind, Luka Biedebach, Sami Nikkonen, Timo Leppänen, Matias Rusanen
Journal of Sleep Research, e14349. Published 2024-10-24. DOI: 10.1111/jsr.14349 (https://doi.org/10.1111/jsr.14349)
A comparative analysis of unsupervised machine-learning methods in PSG-related phenotyping.
Obstructive sleep apnea is a heterogeneous sleep disorder with varying phenotypes. Several studies have already performed cluster analyses to identify obstructive sleep apnea phenotypic clusters. However, the choice of clustering method might affect the output, so it is unclear whether similar obstructive sleep apnea clusters can be reproduced using different clustering methods. In this study, we applied four well-known clustering methods (Agglomerative Hierarchical Clustering, K-means, Fuzzy C-means, and Gaussian Mixture Model) to a population of 865 patients with suspected obstructive sleep apnea. By creating five clusters with each method, we examined the effect of the clustering method on the formation of obstructive sleep apnea clusters and on the differences in their physiological characteristics. We used a visualization technique to show the cluster formations, Cohen's kappa statistics to quantify the agreement between clustering methods, and performance evaluation to compare clustering performance. As a result, two out of five clusters were distinctly different with all four methods, while the three other clusters exhibited overlapping features across all methods. Agreement was strongest between Fuzzy C-means and K-means (κ = 0.87) and weakest between Agglomerative Hierarchical Clustering and Gaussian Mixture Model (κ = 0.51). K-means showed the best clustering performance, followed by Fuzzy C-means, in most evaluation criteria. Moreover, Fuzzy C-means showed the greatest potential for handling overlapping clusters compared with the other methods. In conclusion, we revealed a direct impact of clustering method selection on the formation and physiological characteristics of obstructive sleep apnea clusters. In addition, we highlighted the capability of soft clustering methods, particularly Fuzzy C-means, in obstructive sleep apnea phenotyping.
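The comparison described in the abstract (four clustering methods, five clusters each, pairwise Cohen's kappa) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. Everything below is an assumption made for illustration: the synthetic `make_blobs` data standing in for the 865 patients, the feature count, the hand-written `fuzzy_c_means` helper, the Hungarian label matching in `align_labels` (cluster labels are arbitrary, so they must be matched before kappa is meaningful; the abstract does not say how this was done), and the silhouette score as a stand-in for the unspecified evaluation criteria.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import cohen_kappa_score, confusion_matrix, silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

N_CLUSTERS = 5  # the abstract reports five clusters per method


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy C-means (illustrative helper, not a library API).

    Returns hard labels (argmax of membership) and the membership matrix.
    """
    rng = np.random.default_rng(seed)
    u = rng.dirichlet(np.ones(n_clusters), size=X.shape[0])  # each row sums to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u.argmax(axis=1), u


def align_labels(reference, labels, n_clusters):
    """Relabel `labels` to maximize overlap with `reference` (Hungarian matching)."""
    cm = confusion_matrix(reference, labels, labels=np.arange(n_clusters))
    row, col = linear_sum_assignment(-cm)  # negate to maximize matched counts
    mapping = np.empty(n_clusters, dtype=int)
    mapping[col] = row  # cluster `col` in `labels` maps to `row` in `reference`
    return mapping[labels]


# Synthetic stand-in data: 865 samples, 6 hypothetical features, overlapping blobs.
X, _ = make_blobs(n_samples=865, centers=N_CLUSTERS, n_features=6,
                  cluster_std=2.5, random_state=0)
X = StandardScaler().fit_transform(X)

results = {
    "AHC": AgglomerativeClustering(n_clusters=N_CLUSTERS).fit_predict(X),
    "K-means": KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0).fit_predict(X),
    "FCM": fuzzy_c_means(X, N_CLUSTERS)[0],
    "GMM": GaussianMixture(n_components=N_CLUSTERS, random_state=0).fit_predict(X),
}

# Pairwise agreement between methods after label alignment.
names = list(results)
kappa = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        aligned = align_labels(results[a], results[b], N_CLUSTERS)
        kappa[(a, b)] = cohen_kappa_score(results[a], aligned)
        print(f"kappa({a}, {b}) = {kappa[(a, b)]:.2f}")

# One possible internal validity index (an assumption, see lead-in).
silhouette = {name: silhouette_score(X, labels) for name, labels in results.items()}
for name, score in silhouette.items():
    print(f"silhouette({name}) = {score:.2f}")
```

The membership matrix returned by `fuzzy_c_means` is what distinguishes a soft method from the hard assignments of the others: a sample lying between two clusters receives comparable membership in both, which is one way to examine the overlapping clusters the abstract mentions. The printed values depend on the synthetic data and are not comparable to the κ values reported in the study.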
About the journal:
The Journal of Sleep Research is dedicated to basic and clinical sleep research. The Journal publishes original research papers and invited reviews in all areas of sleep research (including biological rhythms). The Journal aims to promote the exchange of ideas between basic and clinical sleep researchers coming from a wide range of backgrounds and disciplines. The Journal will achieve this by publishing papers which use multidisciplinary and novel approaches to answer important questions about sleep, as well as its disorders and the treatment thereof.