Acceleration of the K-means algorithm by removing stable items
A. Mexicano, Ricardo Rodriguez Jorge, Pascual Noradino Montes Dorantes, Joaquín Pérez Ortega
International Journal of Space-Based and Situated Computing, pp. 72-81. Published 2017-10-03. DOI: 10.1504/IJSSC.2017.10008032
Citations: 1
Abstract
This work presents an approach for speeding up the classification phase of the K-means algorithm. The approach is a heuristic: whenever an object remains in the same group between the previous and the current iteration, it is marked as stable and excluded from the classification-phase computations in the current and subsequent iterations. This reduces execution time relative to the standard version and can be useful in big data applications. To evaluate the approach, both the standard algorithm and the proposed variant were implemented and run on three synthetic and seven well-known real instances. The tests confirmed that the proposed approach takes less time than the standard one. The best result was obtained on the transactions instance grouped into 200 clusters: a time reduction of 90.1% with a reduction in quality of 3.97%.
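
The heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `kmeans_stable_removal`, the random initialization, the stopping rule (stop when no active items remain), and the choice to recompute centroids from all items, including stable ones, are assumptions made for illustration. The counter `distance_evals` is likewise hypothetical and only shows how skipping stable items reduces work in the classification phase.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_stable_removal(points, k, max_iter=100, seed=0):
    """K-means sketch in which an item that keeps its label between two
    consecutive iterations is marked stable and skipped afterwards."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct items chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    labels = [-1] * len(points)        # -1 means "not yet classified"
    active = list(range(len(points)))  # items still being reclassified
    distance_evals = 0                 # illustrative work counter

    for _ in range(max_iter):
        if not active:
            break  # every item is stable, nothing left to classify

        # Classification phase: only active (non-stable) items are visited.
        still_active = []
        for i in active:
            best = min(range(k),
                       key=lambda c: squared_distance(points[i], centroids[c]))
            distance_evals += k
            if best != labels[i]:
                labels[i] = best
                still_active.append(i)
            # Otherwise the item stayed in its group: it is stable and dropped.
        active = still_active

        # Update phase (assumption): centroids use all items, stable or not.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, lab in zip(points, labels):
            counts[lab] += 1
            for d, value in enumerate(p):
                sums[lab][d] += value
        for c in range(k):
            if counts[c]:
                centroids[c] = [s / counts[c] for s in sums[c]]

    return labels, centroids, distance_evals


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids, evals = kmeans_stable_removal(data, k=2)
    print(labels, centroids, evals)
```

Because a stable item is never revisited, it can end up in a group whose centroid is no longer the nearest one after later updates. This is consistent with the trade-off the abstract reports, where the time saving comes with a small reduction in quality.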