Kernel-based iVAT with adaptive cluster extraction

IF 2.5 | Zone 4 (Computer Science) | Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Knowledge and Information Systems | Pub Date: 2024-09-06 | DOI: 10.1007/s10115-024-02189-1
Baojie Zhang, Ye Zhu, Yang Cao, Sutharshan Rajasegarar, Gang Li, Gang Liu
Citations: 0

Abstract

Visual Assessment of cluster Tendency (VAT) is a popular method that visually represents the possible clusters found in a dataset as dark blocks along the diagonal of a reordered dissimilarity image (RDI). Although many variants of the VAT algorithm have been proposed to improve the visualisation quality on different types of datasets, they still suffer from the challenge of extracting clusters with varied densities. In this paper, we focus on overcoming this drawback of VAT algorithms by incorporating kernel methods and also propose a novel adaptive cluster extraction strategy, named CER, to effectively identify the local clusters from the RDI. We examine their effects on an improved VAT method (iVAT) and systematically evaluate the clustering performance on 18 synthetic and real-world datasets. The experimental results reveal that the recently proposed data-dependent dissimilarity measure, namely the Isolation kernel, helps to significantly improve the RDI image for easy cluster identification. Furthermore, the proposed cluster extraction method, CER, outperforms other existing methods on most of the datasets in terms of a series of dissimilarity measures.
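
The reordering idea described in the abstract can be illustrated with a minimal sketch. It assumes the classical VAT ordering (a Prim-like traversal of the dissimilarity matrix) and one common formulation of iVAT, in which each dissimilarity is replaced by the minimax path distance before reordering. The function names vat_order, minimax_path_distances and ivat_image are hypothetical, the input is a plain absolute-difference matrix rather than the Isolation kernel, and the CER extraction step is not shown because the abstract does not specify it.

```python
import numpy as np


def vat_order(D):
    """Return a VAT ordering of a symmetric dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    i = int(i)
    order = [i]
    selected = np.zeros(n, dtype=bool)
    selected[i] = True
    # best[j] = smallest dissimilarity from j to any already selected point.
    best = D[i].astype(float).copy()
    for _ in range(n - 1):
        # Pick the unselected point closest to the selected set.
        candidates = np.where(selected, np.inf, best)
        j = int(np.argmin(candidates))
        order.append(j)
        selected[j] = True
        best = np.minimum(best, D[j])
    return np.array(order)


def minimax_path_distances(D):
    """Replace each entry by the smallest possible largest edge on any path."""
    M = D.astype(float).copy()
    for k in range(M.shape[0]):
        # A path through k costs the larger of its two legs.
        via_k = np.maximum(M[:, k][:, None], M[k, :][None, :])
        M = np.minimum(M, via_k)
    return M


def ivat_image(D):
    """Return the reordered minimax-distance matrix and the ordering used."""
    M = minimax_path_distances(D)
    order = vat_order(M)
    return M[np.ix_(order, order)], order


# Two interleaved 1-D groups, near 0 and near 10 (illustrative data only).
pts = np.array([0.0, 10.0, 0.1, 10.1, 0.2, 10.2])
D = np.abs(pts[:, None] - pts[None, :])
image, order = ivat_image(D)
print(order)
print(np.round(image, 2))
```

In the printed matrix, small values gather in square blocks along the diagonal, which correspond to the dark blocks of the RDI when the matrix is rendered as a grey-scale image. The Floyd-Warshall-style loop is cubic in the number of points and is chosen here for brevity, not efficiency.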

Source journal
Knowledge and Information Systems
Subject area: Engineering and Technology, Computer Science: Artificial Intelligence
CiteScore: 5.70
Self-citation rate: 7.40%
Articles published: 152
Review time: 7.2 months
About the journal: Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.
Latest articles in this journal
Dynamic evolution of causal relationships among cryptocurrencies: an analysis via Bayesian networks
Deep multi-semantic fuzzy K-means with adaptive weight adjustment
Class incremental named entity recognition without forgetting
Spectral clustering with scale fairness constraints
Supervised kernel-based multi-modal Bhattacharya distance learning for imbalanced data classification