V. Megalooikonomou, Qiang Wang, Guo Li, C. Faloutsos
{"title":"A multiresolution symbolic representation of time series","authors":"V. Megalooikonomou, Qiang Wang, Guo Li, C. Faloutsos","doi":"10.1109/ICDE.2005.10","DOIUrl":null,"url":null,"abstract":"Efficiently and accurately searching for similarities among time series and discovering interesting patterns is an important and non-trivial problem. In this paper, we introduce a new representation of time series, the multiresolution vector quantized (MVQ) approximation, along with a new distance function. The novelty of MVQ is that it keeps both local and global information about the original time series in a hierarchical mechanism, processing the original time series at multiple resolutions. Moreover, the proposed representation is symbolic employing key subsequences and potentially allows the application of text-based retrieval techniques into the similarity analysis of time series. The proposed method is fast and scales linearly with the size of database and the dimensionality. Contrary to the vast majority in the literature that uses the Euclidean distance, MVQ uses a multi-resolution/hierarchical distance function. We performed experiments with real and synthetic data. The proposed distance function consistently outperforms all the major competitors (Euclidean, dynamic time warping, piecewise aggregate approximation) achieving up to 20% better precision/recall and clustering accuracy on the tested datasets.","PeriodicalId":297231,"journal":{"name":"21st International Conference on Data Engineering (ICDE'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"109","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"21st International Conference on Data Engineering (ICDE'05)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2005.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 109
Abstract
Efficiently and accurately searching for similarities among time series and discovering interesting patterns is an important and non-trivial problem. In this paper, we introduce a new representation of time series, the multiresolution vector quantized (MVQ) approximation, along with a new distance function. The novelty of MVQ is that it keeps both local and global information about the original time series in a hierarchical mechanism, processing the original time series at multiple resolutions. Moreover, the proposed representation is symbolic employing key subsequences and potentially allows the application of text-based retrieval techniques into the similarity analysis of time series. The proposed method is fast and scales linearly with the size of database and the dimensionality. Contrary to the vast majority in the literature that uses the Euclidean distance, MVQ uses a multi-resolution/hierarchical distance function. We performed experiments with real and synthetic data. The proposed distance function consistently outperforms all the major competitors (Euclidean, dynamic time warping, piecewise aggregate approximation) achieving up to 20% better precision/recall and clustering accuracy on the tested datasets.