{"title":"Efficient genetic K-Means clustering for health care knowledge discovery","authors":"Ahmed Alsayat, H. El-Sayed","doi":"10.1109/SERA.2016.7516127","DOIUrl":null,"url":null,"abstract":"Data mining and machine learning are becoming the most interesting research areas and increasingly popular in health organizations. The hidden patterns among patients data can be extracted by applying data mining. The techniques and tools of data mining are very helpful as they provide health care professionals with significant knowledge toward a decision. Researchers have shown several utilities of data mining techniques such as clustering, classification, and regression in health care domain. Particularly, clustering algorithms which help researchers discover new insights by segmenting patients and providing them with effective treatments. This paper, reviews existing methods of clustering and present an efficient K-Means clustering algorithm which uses Self Organizing Map (SOM) method to overcome the problem of finding number of centroids in traditional K-Means. The SOM based clustering is very efficient due to its unsupervised learning and topology preserving properties. Two-staged clustering algorithm uses SOM to produce the prototypes in the first stage and then use those prototypes to create clusters in the second stage. Two health care datasets are used in the proposed experiments and a cluster accuracy metric was applied to evaluate the performance of the algorithm. Our analysis shows that the proposed method is accurate and shows better clustering performance along with valuable insights for each cluster. 
Our approach is unsupervised, scalable and can be applied to various domains.","PeriodicalId":412361,"journal":{"name":"2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SERA.2016.7516127","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 28
Abstract
Data mining and machine learning are becoming some of the most interesting research areas and are increasingly popular in health organizations. Data mining can extract hidden patterns from patient data, and its techniques and tools give health care professionals significant knowledge to support decisions. Researchers have demonstrated several uses of data mining techniques such as clustering, classification, and regression in the health care domain. Clustering algorithms in particular help researchers discover new insights by segmenting patients so that they can be provided with effective treatments. This paper reviews existing clustering methods and presents an efficient K-Means clustering algorithm that uses the Self-Organizing Map (SOM) method to overcome the problem of finding the number of centroids in traditional K-Means. SOM-based clustering is very efficient because of its unsupervised learning and topology-preserving properties. The two-stage clustering algorithm uses SOM to produce prototypes in the first stage and then uses those prototypes to create clusters in the second stage. Two health care datasets are used in the experiments, and a cluster accuracy metric is applied to evaluate the performance of the algorithm. Our analysis shows that the proposed method is accurate and achieves better clustering performance, along with valuable insights for each cluster. Our approach is unsupervised and scalable, and it can be applied to various domains.
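
The two-stage idea described in the abstract (SOM prototypes first, then clustering of those prototypes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train_som`, `kmeans`, `two_stage_cluster`), the grid size, and the learning-rate and neighborhood schedules are all assumptions chosen for illustration. The abstract does not say how the number of clusters is derived from the SOM, so this sketch simply takes `k` as a parameter.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def train_som(data, rows, cols, epochs=30, lr0=0.5, sigma0=1.0, seed=0):
    """Stage 1 (illustrative): fit a rows x cols SOM and return its prototypes.

    Assumes len(data) >= rows * cols, because prototypes start as distinct
    data points. The linear decay schedules are assumptions, not the paper's.
    """
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    protos = [list(p) for p in rng.sample(data, len(grid))]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = 1.0 - step / total_steps
            lr = lr0 * frac
            sigma = max(sigma0 * frac, 0.3)
            bmu = nearest(x, protos)  # best-matching unit
            for j, w in enumerate(protos):
                # Gaussian neighborhood measured on the SOM grid, not in data space
                h = math.exp(-sq_dist(grid[bmu], grid[j]) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
            step += 1
    return protos


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def two_stage_cluster(data, rows, cols, k, seed=0):
    """Stage 2 (illustrative): cluster the SOM prototypes, then label each
    data point with the cluster of its best-matching prototype."""
    protos = train_som(data, rows, cols, seed=seed)
    centers = kmeans(protos, k, seed=seed)
    proto_labels = [nearest(p, centers) for p in protos]
    return [proto_labels[nearest(x, protos)] for x in data]
```

Calling `two_stage_cluster(data, rows=2, cols=2, k=2)` on a list of numeric feature vectors returns one cluster label per input row. Because K-Means runs only on the `rows * cols` prototypes rather than on every record, the second stage stays cheap even when the dataset is large.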