Time series symbolization and search for frequent patterns
Mai Van Hoan, M. Exbrayat
Proceedings of the 4th Symposium on Information and Communication Technology, 2013-12-05
DOI: 10.1145/2542050.2542057 (https://doi.org/10.1145/2542050.2542057)
Citations: 4
Abstract
In this paper, we focus on two aspects of time series mining: first, the transformation of numerical data into symbolic data; second, the search for frequent patterns in the resulting symbolic time series. We are thus interested in patterns that occur with high frequency in our database of time series and that might help generate candidates for various time series mining tasks. During the symbolization phase, we transform the numerical time series into a symbolic time series by i) splitting the numerical series into consecutive subsequences and ii) clustering these subsequences with a clustering algorithm, each subsequence then being replaced by the name of its cluster to produce the symbolic time series. In the second phase, we use a sliding window to create a collection of transactions from the symbolic time series, then apply a sequential pattern mining algorithm to find interesting motifs in the original time series. An example experiment based on environmental data is presented.
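
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names neither the clustering algorithm nor the sequential pattern miner, so a tiny deterministic k-means and a simple count of contiguous patterns stand in for them here. All function names, the toy series, and the parameters (`length`, `k`, `width`, `n`, `min_support`) are assumptions chosen for illustration only.

```python
from collections import Counter


def split_subsequences(series, length):
    """Cut the series into consecutive, non-overlapping subsequences.
    A trailing remainder shorter than `length` is dropped."""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, length)]


def kmeans_labels(points, k, iterations=20):
    """Stand-in clustering step (assumption): a tiny k-means seeded with
    the first k distinct points, so the result is deterministic."""
    centroids = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break

    def nearest(p):
        return min(range(len(centroids)),
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(p, centroids[c])))

    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def symbolize(series, length, k):
    """Phase 1: replace each subsequence by the name of its cluster."""
    labels = kmeans_labels(split_subsequences(series, length), k)
    return "".join(chr(ord("a") + lab) for lab in labels)


def sliding_windows(symbols, width):
    """Phase 2a: one transaction per position of a sliding window."""
    return [symbols[i:i + width] for i in range(len(symbols) - width + 1)]


def frequent_patterns(transactions, n, min_support):
    """Phase 2b, stand-in miner (assumption): support of a contiguous
    pattern of length n = number of transactions that contain it."""
    support = Counter()
    for t in transactions:
        support.update({t[i:i + n] for i in range(len(t) - n + 1)})
    return {p: s for p, s in support.items() if s >= min_support}


# Toy numerical series (hypothetical data, not from the paper).
series = [0, 1, 0, 1, 5, 6, 0, 1, 5, 6, 0, 1]
symbols = symbolize(series, length=2, k=2)           # "aababa"
transactions = sliding_windows(symbols, width=3)     # ["aab", "aba", "bab", "aba"]
motifs = frequent_patterns(transactions, n=2, min_support=3)  # {"ab": 4, "ba": 3}
```

In this sketch each symbol stands for one cluster of subsequences, so a frequent symbolic pattern such as `ab` maps back to a recurring shape in the original numerical series. A real pipeline would likely replace the stand-in miner with a dedicated sequential pattern mining algorithm that also handles gapped patterns.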