Time series clustering with random convolutional kernels

Data Mining and Knowledge Discovery (IF 2.8; Zone 3, Computer Science; Q2, Computer Science, Artificial Intelligence) Pub Date: 2024-04-01 DOI: 10.1007/s10618-024-01018-x

Citations: 0

Abstract

Time series data, spanning applications ranging from climatology to finance to healthcare, presents significant challenges in data mining due to its size and complexity. One open issue lies in time series clustering, which is crucial for processing large volumes of unlabeled time series data and unlocking valuable insights. Traditional and modern analysis methods, however, often struggle with these complexities. To address these limitations, we introduce R-Clustering, a novel method that utilizes convolutional architectures with randomly selected parameters. Through extensive evaluations, R-Clustering demonstrates superior performance over existing methods in terms of clustering accuracy, computational efficiency and scalability. Empirical results obtained using the UCR archive demonstrate the effectiveness of our approach across diverse time series datasets. The findings highlight the significance of R-Clustering in various domains and applications, contributing to the advancement of time series data mining.
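To make the idea of "convolutional architectures with randomly selected parameters" concrete, the following is a minimal sketch and not the authors' R-Clustering implementation. It assumes NumPy is available. The function names (`random_kernel_features`, `kmeans`), the kernel lengths, the bias range, the two pooled statistics and the use of plain k-means are all illustrative assumptions; the abstract does not specify any of these details.

```python
import numpy as np


def random_kernel_features(X, n_kernels=100, seed=0):
    """Map each series (row of X) to 2 * n_kernels features.

    Every kernel has random length, weights and bias (all hypothetical
    choices); no parameter is learned from the data.
    """
    rng = np.random.default_rng(seed)
    n_series = X.shape[0]
    features = np.empty((n_series, 2 * n_kernels))
    for k in range(n_kernels):
        size = rng.choice([7, 9, 11])        # assumed kernel lengths
        weights = rng.normal(size=size)
        weights -= weights.mean()            # zero-mean weights
        bias = rng.uniform(-1.0, 1.0)        # assumed bias range
        for i in range(n_series):
            response = np.convolve(X[i], weights, mode="valid") + bias
            features[i, 2 * k] = response.max()             # max pooling
            features[i, 2 * k + 1] = (response > 0).mean()  # share of positive values
    return features


def kmeans(F, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the feature matrix F; returns cluster labels."""
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):          # leave empty clusters unchanged
                centers[c] = F[labels == c].mean(axis=0)
    return labels


# Toy usage: 20 unlabeled series of length 128, grouped into 2 clusters.
demo_series = np.random.default_rng(42).normal(size=(20, 128))
demo_features = random_kernel_features(demo_series, n_kernels=50)
demo_labels = kmeans(demo_features, n_clusters=2)
```

Because the kernels are never trained, the feature step is a single pass over the data, which is one plausible reason such approaches can scale well. Any clustering algorithm that accepts a feature matrix could replace `kmeans` here.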


Source journal

Data Mining and Knowledge Discovery (Engineering and Technology - Computer Science: Artificial Intelligence)

CiteScore: 10.40

Self-citation rate: 4.20%

Review time: 10 months

Journal description: Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high-performance computing.
