An Evolutionary Algorithm for Automatic Recommendation of Clustering Methods and its Parameters

Jessica A. Carballido, Macarena A. Latini, Ignacio Ponzoni, Rocío L. Cecchini
DOI: 10.1016/j.endm.2018.07.030
Published: 2018-08-01
URL: https://www.sciencedirect.com/science/article/pii/S1571065318301744
Citations: 2

Abstract

One of the main problems in data clustering is determining the best clustering method together with the ideal number (k) of groups into which the data should be separated. This paper presents a preliminary version of a clustering recommender method which, starting from a set of standardized data, suggests the best clustering strategy and proposes an advisable value of k. To this end, the algorithm considers four indices for evaluating the final cluster structure: Dunn, Silhouette, Widest Gap and Entropy. The prototype is implemented as a Genetic Algorithm in which individuals are possible configurations of the methods and their parameters. In this first prototype, the algorithm chooses among four partitioning methods, namely K-means, PAM, CLARA and Fanny, and also obtains the best set of parameters for executing the suggested method. The prototype was developed in an R environment, and its findings were found to be consistent with a combination of results provided by other methods with similar objectives. The prototype is intended to serve as the initial basis for a more complex framework that also incorporates the reduction of matrices with vast numbers of rows.
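To make the idea of "individuals as configurations of a method and its parameters" concrete, the following is a minimal Python sketch, not the authors' R prototype. It rests on several assumptions that do not come from the abstract: individuals are `(method, k)` pairs, only two simplified partitioning methods are available (a plain k-means and an alternating k-medoids used as a rough stand-in for PAM), fitness is the mean Silhouette width alone rather than a combination of the four indices, and the names `partition`, `fitness`, `recommend` and `POINTS` are hypothetical. Seeding is deterministic (farthest-first) so that each configuration always gets the same score.

```python
import random
from math import dist

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then keep adding the point
    whose nearest chosen center is farthest away."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c])) for p in points]

def centroid(members):
    """Coordinate-wise mean (k-means style update)."""
    return tuple(sum(xs) / len(members) for xs in zip(*members))

def medoid(members):
    """Member with the smallest total distance to the others (simplified PAM stand-in)."""
    return min(members, key=lambda m: sum(dist(m, q) for q in members))

# Hypothetical method table: method name -> center update rule.
METHODS = {"kmeans": centroid, "kmedoids": medoid}

def partition(points, k, update, iters=20):
    """Alternate assignment and center updates; return one label per point."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = update(members)
    return assign(points, centers)

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def fitness(individual, points):
    """Score a (method, k) configuration by the silhouette of its clustering."""
    method, k = individual
    return silhouette(points, partition(points, k, METHODS[method]))

def recommend(points, k_max=6, pop_size=6, generations=8, seed=0):
    """Tiny genetic algorithm over (method, k); returns (best individual, score)."""
    rng = random.Random(seed)
    cache = {}
    def score(ind):
        if ind not in cache:
            cache[ind] = fitness(ind, points)
        return cache[ind]
    def random_ind():
        return (rng.choice(sorted(METHODS)), rng.randint(2, k_max))
    pop = [random_ind() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = (p1[0], p2[1])              # crossover: method from p1, k from p2
            if rng.random() < 0.3:              # mutation: replace with a random individual
                child = random_ind()
            children.append(child)
        pop = parents + children
    best = max(pop, key=score)
    return best, score(best)

# Toy data: three well-separated 2x2 squares of points.
POINTS = [(x + dx, y + dy) for x, y in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]]
best, best_score = recommend(POINTS)
print(best, round(best_score, 3))
```

On this toy data, the k-means configuration with k = 3 scores higher than k = 2, 4, 5 or 6, which is the kind of signal a recommender of this style relies on. A fuller version would add CLARA and Fanny to the method table and aggregate the Dunn, Widest Gap and Entropy indices into the fitness; how the paper combines them is not stated in the abstract, so that part is left out here.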

Source journal: Electronic Notes in Discrete Mathematics (Mathematics: Discrete Mathematics and Combinatorics)
About the journal: Electronic Notes in Discrete Mathematics is a venue for the rapid electronic publication of conference proceedings, lecture notes, monographs and other similar material for which quick publication is appropriate. Organizers of conferences whose proceedings appear in Electronic Notes in Discrete Mathematics, and authors of other material appearing as a volume in the series, are allowed to make hard copies of the relevant volume for limited distribution. For example, conference proceedings may be distributed to participants at the meeting, and lecture notes can be distributed to those taking a course based on the material in the volume.