Yifan He, Yatao Bian, Xi Ding, Bingzhe Wu, Jihong Guan, Ji Zhang, Shuigeng Zhou
{"title":"Variate Associated Domain Adaptation for Unsupervised Multivariate Time Series Anomaly Detection","authors":"Yifan He, Yatao Bian, Xi Ding, Bingzhe Wu, Jihong Guan, Ji Zhang, Shuigeng Zhou","doi":"10.1145/3663573","DOIUrl":null,"url":null,"abstract":"<p>Multivariate Time Series Anomaly Detection (MTS-AD) is crucial for the effective management and maintenance of devices in complex systems such as server clusters, spacecrafts and financial systems etc. However, upgrade or cross-platform deployment of these devices will introduce the issue of cross-domain distribution shift, which leads to the prototypical problem of Domain Adaptation for MTS-AD. Compared with general domain adaptation problems, MTS-AD domain adaptation presents two peculiar challenges: 1) The dimensions of data from the source domain and the target domain are usually different, so alignment without losing any information is necessary. 2) The association between different variates plays a vital role in the MTS-AD task, which is overlooked by traditional domain adaptation approaches. Aiming at addressing the above issues, we propose a <b>V</b>ariate <b>A</b>ssociated domai<b>N</b> a<b>D</b>aptation method combined with a Gr<b>A</b>ph Deviation Network (abbreviated as <monospace>VANDA</monospace>) for MTS-AD, which includes two major contributions. First, we characterize the intra-domain variate associations of the source domain by a graph deviation network (GDN), which can share parameters across domains without dimension alignment. Second, we propose a sliding similarity to measure the inter-domain variate associations and perform joint training by minimizing the optimal transport distance between source and target data for transferring variate associations across domains. <monospace>VANDA</monospace> achieves domain adaptation by transferring both variate associations and GDN parameters from the source domain to the target domain. We construct two pairs of MTS-AD datasets from existing MTS-AD data and combine three domain adaptation strategies with six MTS-AD backbones as the benchmark methods for experimental evaluation and comparison. Extensive experiments demonstrate the effectiveness of our approach, which outperforms the benchmark methods, and significantly improves the AD performance of the target domain by effectively utilizing the source domain knowledge.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"247 1","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3663573","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Multivariate Time Series Anomaly Detection (MTS-AD) is crucial for the effective management and maintenance of devices in complex systems such as server clusters, spacecrafts and financial systems etc. However, upgrade or cross-platform deployment of these devices will introduce the issue of cross-domain distribution shift, which leads to the prototypical problem of Domain Adaptation for MTS-AD. Compared with general domain adaptation problems, MTS-AD domain adaptation presents two peculiar challenges: 1) The dimensions of data from the source domain and the target domain are usually different, so alignment without losing any information is necessary. 2) The association between different variates plays a vital role in the MTS-AD task, which is overlooked by traditional domain adaptation approaches. Aiming at addressing the above issues, we propose a Variate Associated domaiN aDaptation method combined with a GrAph Deviation Network (abbreviated as VANDA) for MTS-AD, which includes two major contributions. First, we characterize the intra-domain variate associations of the source domain by a graph deviation network (GDN), which can share parameters across domains without dimension alignment. Second, we propose a sliding similarity to measure the inter-domain variate associations and perform joint training by minimizing the optimal transport distance between source and target data for transferring variate associations across domains. VANDA achieves domain adaptation by transferring both variate associations and GDN parameters from the source domain to the target domain. We construct two pairs of MTS-AD datasets from existing MTS-AD data and combine three domain adaptation strategies with six MTS-AD backbones as the benchmark methods for experimental evaluation and comparison. Extensive experiments demonstrate the effectiveness of our approach, which outperforms the benchmark methods, and significantly improves the AD performance of the target domain by effectively utilizing the source domain knowledge.
期刊介绍:
TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.