Parallel and fault-tolerant k-means clustering based on the actor model

Multiagent and Grid Systems · Pub Date: 2020-01-01 · DOI: 10.3233/mgs-200336 · IF 0.6 · JCR Q4 (Computer Science, Theory & Methods)

Salah Taamneh, A. Qawasmeh, A. Aljammal

Multiagent and Grid Systems, vol. 11 1, pp. 379-396

Citations: 1

Abstract

The k-means algorithm is a well-known unsupervised machine learning tool that splits a given dataset into a fixed number of clusters through iterative refinement. Running such an algorithm on today’s datasets, which are characterized by high dimensionality and huge size, requires fault-tolerance mechanisms to mitigate the impact of possible failures. In this paper, we propose an actor-based implementation of the k-means algorithm. The algorithm was made fault-tolerant by periodically saving the centroids to stable storage during failure-free execution and restarting from the last saved centroids upon a failure. This was implemented in two different ways: optimistic checkpointing (blocking) and pessimistic checkpointing (non-blocking). The actor-based k-means algorithm was evaluated on a machine with eight cores. The experiments showed that the proposed algorithm scales very well as the number of workers increases and can be up to about 2x faster than a Java-thread-based implementation of the k-means algorithm. The results also showed that the optimistic algorithm outperformed the pessimistic one, particularly in the presence of competing I/O operations. Several failures were forced during execution to evaluate the performance of the fault-tolerant implementations. The experiments showed that the average amount of lost work ranged from 3% to 6%.
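The checkpoint-and-restart idea described in the abstract can be illustrated with a minimal, single-process Python sketch. This is not the authors' actor-based implementation: it is sequential, the checkpoint write is synchronous (it blocks the iteration loop), and every name in it (`kmeans`, `save_checkpoint`, `load_checkpoint`, the `every` and `fail_at` parameters, the JSON file layout) is a hypothetical choice made for illustration only.

```python
import json
import os
import tempfile


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:  # an empty cluster keeps its previous centroid
            new.append(list(old))
        else:
            new.append([sum(col) / len(members) for col in zip(*members)])
    return new


def save_checkpoint(path, iteration, centroids):
    """Write to a temp file, then rename, so a crash mid-write
    leaves the previous checkpoint intact."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"iteration": iteration, "centroids": centroids}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the last saved state, or None if no checkpoint exists."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def kmeans(points, init_centroids, path, every=2, max_iter=100, fail_at=None):
    """Run k-means, saving centroids every `every` iterations.
    `fail_at` (hypothetical) injects a failure before that iteration."""
    state = load_checkpoint(path)
    if state is not None:  # restart from the last saved centroids
        it, centroids = state["iteration"], state["centroids"]
    else:
        it, centroids = 0, [list(c) for c in init_centroids]
    while it < max_iter:
        if fail_at is not None and it == fail_at:
            raise RuntimeError("simulated failure")
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        it += 1
        converged = new_centroids == centroids
        centroids = new_centroids
        if it % every == 0 or converged:
            save_checkpoint(path, it, centroids)
        if converged:
            break
    return centroids, it


if __name__ == "__main__":
    pts = [[0, 0], [0, 1], [10, 10], [10, 11]]
    init = [[0, 0], [1, 1]]
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "centroids.json")
        try:
            kmeans(pts, init, ckpt, every=1, fail_at=1)  # fails after one iteration
        except RuntimeError:
            pass
        print(kmeans(pts, init, ckpt, every=1))  # resumes from the checkpoint
```

In this sketch the work lost to a failure is bounded by the checkpoint interval `every`: a smaller interval loses fewer iterations on restart but pays the storage write more often, which is the trade-off the two checkpointing variants in the paper address.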


Source journal

Multiagent and Grid Systems (Computer Science, Theory & Methods)

CiteScore

1.50

Self-citation rate

0.00%

Articles published