Relational data partitioning using evolutionary game theory

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM). Pub Date: 2014-12-01. DOI: 10.1109/CIDM.2014.7008656

L. Hall, Alireza Chakeri


Citations: 7

Abstract

This paper presents a new approach to relational data partitioning based on the notion of dominant sets. A dominant set is a subset of data points that satisfies the constraints of internal homogeneity and external inhomogeneity, i.e., a cluster. However, since no proper subset of a dominant set can itself be a dominant set, dominant sets tend to be compact. Hence, we present a novel approach for enumerating well-distributed clusters that does not require the number of clusters to be known. When the number of clusters is known, the solution space is searched as follows: after each dominant set is found, the data points are partitioned into two disjoint subsets using spectral graph image segmentation methods, so that the other well-distributed dominant sets can be enumerated. For the latter case, we introduce a new hierarchical approach for relational data partitioning using a new class of evolutionary game theory dynamics called InImDynamics, which is very fast: its computational time is linear in the number of data points. At each level of the proposed hierarchy, Dunn's index is used to find the appropriate number of clusters. The objects are then partitioned according to the projected number of clusters using game-theoretic relations. The same method is applied to each partition to extract its underlying structure. Although the resulting clusters exist in their respective partitions, they may not be clusters of the entire data set. Hence, each one is checked for being an actual cluster and, if it is not, it is extended to an existing cluster of the data. The approach can also be used to assign unseen data to existing clusters.
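To make the two building blocks named in the abstract more concrete, here is a minimal sketch in Python. It is not the paper's InImDynamics, whose update rule the abstract does not give. Instead it uses the standard discrete-time replicator dynamics commonly used to extract a dominant set from a similarity matrix, together with a Dunn's index computed directly on a dissimilarity matrix. The function names, the tolerance values, the support cutoff, and the toy matrix are all assumptions made for illustration.

```python
def replicator_dominant_set(A, iters=1000, tol=1e-9, cutoff=1e-4):
    """Run standard replicator dynamics (NOT InImDynamics) on a symmetric,
    non-negative similarity matrix A (list of lists, zero diagonal).
    Returns (support, x): indices whose weight exceeds the assumed cutoff,
    and the final weight vector on the simplex."""
    n = len(A)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0.0:  # no similarity mass left; nothing to update
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if delta < tol:  # weights have stopped changing
            break
    support = [i for i in range(n) if x[i] > cutoff]
    return support, x


def dunn_index(D, clusters):
    """Dunn's index on a dissimilarity matrix D: smallest between-cluster
    dissimilarity divided by the largest within-cluster dissimilarity.
    Expects at least two clusters, each given as a list of indices."""
    min_sep = min(
        D[i][j]
        for a in range(len(clusters))
        for b in range(a + 1, len(clusters))
        for i in clusters[a]
        for j in clusters[b]
    )
    max_diam = max(D[i][j] for c in clusters for i in c for j in c)
    if max_diam == 0.0:  # all clusters are singletons or duplicates
        return float("inf")
    return min_sep / max_diam


# Toy relational data (assumed): objects 0-2 are mutually similar (0.9),
# objects 3-4 are mutually similar (0.8), cross-group similarity is 0.1.
groups = [0, 0, 0, 1, 1]
within = {0: 0.9, 1: 0.8}
A = [
    [0.0 if i == j else (within[groups[i]] if groups[i] == groups[j] else 0.1)
     for j in range(5)]
    for i in range(5)
]
D = [[0.0 if i == j else 1.0 - A[i][j] for j in range(5)] for i in range(5)]

support, weights = replicator_dominant_set(A)
rest = [i for i in range(5) if i not in support]
print("dominant set:", support)
print("Dunn's index of the split:", dunn_index(D, [support, rest]))
```

On this toy matrix the dynamics concentrate all weight on the more cohesive group, which illustrates why dominant sets tend to be compact. A higher Dunn's index indicates a split with well-separated, tight clusters, which is the sense in which the abstract uses it to choose the number of clusters at each level of the hierarchy.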



