A Scalable k-Medoids Clustering via Whale Optimization Algorithm

Huang Chenan, Narumasa Tsutsumida
Source: arXiv - CS - Performance
Published: 2024-08-30
DOI: arxiv-2408.16993 (https://doi.org/arxiv-2408.16993)
Citations: 0

Abstract

Unsupervised clustering has emerged as a critical tool for uncovering hidden patterns and insights in vast, unlabeled datasets. However, traditional methods such as Partitioning Around Medoids (PAM) scale poorly because of their quadratic computational complexity. To address this limitation, we introduce WOA-kMedoids, a novel unsupervised clustering method that incorporates the Whale Optimization Algorithm (WOA), a metaheuristic inspired by the hunting strategies of humpback whales. By optimizing medoid selection, WOA-kMedoids reduces the computational complexity of the k-medoids algorithm from quadratic to near-linear in the number of observations. This gain in efficiency lets WOA-kMedoids scale to large datasets while maintaining high clustering accuracy. We evaluated WOA-kMedoids on 25 diverse time series datasets from the UCR archive. Our empirical results show that WOA-kMedoids maintains clustering accuracy similar to that of PAM. Although WOA-kMedoids had a slightly higher runtime than PAM on small datasets (fewer than 300 observations), it was more computationally efficient than PAM on larger datasets. The scalability of WOA-kMedoids, combined with its consistently high accuracy, makes it a promising and practical choice for unsupervised clustering in big data applications, with implications for efficient knowledge discovery in massive, unlabeled datasets across various domains.
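
The abstract does not give implementation details, so the following is only a minimal sketch of how a whale-optimization search over medoid indices could look, not the authors' WOA-kMedoids code. Several things here are assumptions made for illustration: each whale encodes k candidate medoids as a continuous vector that is rounded to row indices, distances are Euclidean (the paper works with time series and may use a different dissimilarity), the coefficients `A` and `C` are drawn as scalars per whale, and the names `medoid_cost`, `decode`, and `woa_kmedoids` are hypothetical. Each cost evaluation computes only the n-by-k distances to the current candidate medoids, so the full pairwise distance matrix that PAM relies on is never built.

```python
import numpy as np


def medoid_cost(X, medoid_idx):
    """Sum of distances from every point to its nearest candidate medoid."""
    # Only n * k distances are computed, never the full n * n matrix.
    d = np.linalg.norm(X[:, None, :] - X[medoid_idx][None, :, :], axis=2)
    return d.min(axis=1).sum()


def decode(position, n):
    """Round a continuous whale position to row indices in [0, n - 1].

    Assumption: duplicates are allowed, which simply wastes a medoid.
    """
    return np.clip(np.rint(position), 0, n - 1).astype(int)


def woa_kmedoids(X, k, n_whales=10, n_iter=50, b=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Each whale is a length-k vector of (continuous) medoid indices.
    whales = rng.uniform(0, n - 1, size=(n_whales, k))
    costs = np.array([medoid_cost(X, decode(w, n)) for w in whales])
    best = whales[costs.argmin()].copy()
    best_cost = costs.min()

    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # shrinks linearly from 2 toward 0
        for i in range(n_whales):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                if abs(A) < 1:
                    target = best
                else:
                    target = whales[rng.integers(n_whales)]
                new = target - A * np.abs(C * target - whales[i])
            else:
                # Spiral (bubble-net) move toward the best whale.
                l = rng.uniform(-1, 1)
                new = (np.abs(best - whales[i]) * np.exp(b * l)
                       * np.cos(2 * np.pi * l) + best)
            whales[i] = np.clip(new, 0, n - 1)
            c = medoid_cost(X, decode(whales[i], n))
            if c < best_cost:
                best, best_cost = whales[i].copy(), c

    medoids = decode(best, n)
    d = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
    return medoids, d.argmin(axis=1), best_cost
```

Calling `woa_kmedoids(X, k=3)` on an `(n, d)` NumPy array returns the selected medoid row indices, a cluster label for each observation, and the total cost. Under these assumptions the work per iteration grows with `n_whales * n * k` rather than with `n * n`, which illustrates the kind of near-linear scaling in the number of observations that the abstract describes.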