Enhancing K means by unsupervised learning using PSO algorithm

2017 International Conference on Computing, Communication and Automation (ICCCA) Pub Date: 2017-05-01 DOI: 10.1109/CCAA.2017.8229805

Aishwarya Gupta, Vishwajeet Pattanaik, Mayank Singh

Pages: 228-233

Citations: 6

Abstract


Data clustering remains a central topic in data mining. Clustering a dataset is easy; achieving the required accuracy, precision and performance is not. K-means, one of the oldest clustering algorithms, has been tested thousands of times on a variety of datasets and in combination with other algorithms because of its robustness and simplicity, but the approach proposed here has not been suggested before. It uses K-means for evaluation and validation, while the optimization is performed by the Particle Swarm Optimization (PSO) algorithm. The work is motivated by two drawbacks of K-means: it converges to local optima, and the number of clusters must be fixed at an early stage. To attain global convergence, swarm intelligence is preferred over genetic algorithms and many other techniques. To address the number of clusters, we combine two functions: one helps determine the number of clusters that is optimal for a particular dataset, and the other validates the results using another function and compares the various metrics that define the goodness and fitness of the algorithm. In one line, the proposed algorithm can be described as 'evaluating the data using an Evalcluster function, performing validation with the help of an Evaluate function of K-means, and giving the final touch of optimizing the data with the K-means PSO algorithm'. The algorithm was tested on over 4 datasets available in the UCI Repository, and the results were better than expected.
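The abstract does not give the update rules, so the following is only a minimal sketch of one common way to combine PSO with the K-means objective, not the authors' implementation. It assumes that each particle encodes a full set of k centroids and that fitness is the within-cluster sum of squared errors. The names `sse` and `pso_kmeans`, the swarm size, and the coefficients `w`, `c1`, `c2` are assumptions chosen for illustration. The number of clusters k is taken as given here; the step that selects it (the Evalcluster function mentioned above) is not reproduced.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def pso_kmeans(points, k, n_particles=20, n_iters=100,
               w=0.72, c1=1.49, c2=1.49, seed=0):
    """Search for k centroids that minimise sse() with a basic PSO loop.

    Assumed encoding: one particle = one candidate list of k centroids.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Initialise every particle from randomly chosen data points, zero velocity.
    positions = [[list(rng.choice(points)) for _ in range(k)]
                 for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [[c[:] for c in pos] for pos in positions]
    pbest_cost = [sse(pos, points) for pos in positions]

    # Global best is the best personal best seen so far.
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest = [c[:] for c in pbest[g]]
    gbest_cost = pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Inertia + pull towards personal best + pull towards global best.
                    velocities[i][j][d] = (
                        w * velocities[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - positions[i][j][d])
                        + c2 * r2 * (gbest[j][d] - positions[i][j][d])
                    )
                    positions[i][j][d] += velocities[i][j][d]

            cost = sse(positions[i], points)
            if cost < pbest_cost[i]:
                pbest[i] = [c[:] for c in positions[i]]
                pbest_cost[i] = cost
                if cost < gbest_cost:
                    gbest = [c[:] for c in positions[i]]
                    gbest_cost = cost

    return gbest, gbest_cost


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up toy data).
    points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    centroids, cost = pso_kmeans(points, k=2)
    print(centroids, cost)
```

Because many particles explore the centroid space at once and share the best solution found, a single poor initialisation matters less than it does in plain K-means. This is the intuition behind preferring a swarm search when local convergence is the concern.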
