Using Bayesian Network Learning Algorithm to Discover Causal Relations in Multivariate Time Series
Zhenxing Wang, L. Chan. 2011 IEEE 11th International Conference on Data Mining, 2011-12-11. DOI: 10.1109/ICDM.2011.153 (https://doi.org/10.1109/ICDM.2011.153)
Many applications naturally involve time series data, and vector autoregression (VAR) and structural VAR (SVAR) are the dominant tools for investigating relations between variables in time series. In the first part of this work, we show that the SVAR method cannot identify contemporaneous causal relations when the data follow Gaussian distributions. In addition, least squares estimators become unreliable when the problem is large and observations are limited. In the remaining part, we propose applying Bayesian network learning algorithms to identify SVARs from time series data, in order to capture both temporal and contemporaneous causal relations while avoiding high-order statistical tests. The difficulty in applying Bayesian network learning algorithms to time series is that the corresponding networks tend to be large, in which case these algorithms require high-order statistical tests. To overcome this difficulty, we show that the search for conditioning sets that d-separate two vertices can be restricted to subsets of Markov blankets. Based on this fact, we propose an algorithm that learns Bayesian networks locally and makes the largest order of statistical tests independent of the scale of the problem. Empirical results show that our algorithm outperforms existing methods in both efficiency and accuracy.
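The Markov blanket restriction described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the true DAG is already known and uses a graphical d-separation check (via the moralized ancestral graph) in place of statistical tests on data. The function names (`markov_blanket`, `d_separated`, `find_separator`) and the toy graph `dag` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

def parents(dag, v):
    """Return the set of parents of v; dag maps each node to its children."""
    return {u for u, children in dag.items() if v in children}

def markov_blanket(dag, v):
    """Parents, children, and the children's other parents (spouses) of v."""
    children = set(dag[v])
    spouses = set()
    for c in children:
        spouses |= parents(dag, c)
    return (parents(dag, v) | children | spouses) - {v}

def ancestors(dag, nodes):
    """Return the given nodes together with all of their ancestors."""
    result, stack = set(nodes), list(nodes)
    while stack:
        v = stack.pop()
        for p in parents(dag, v):
            if p not in result:
                result.add(p)
                stack.append(p)
    return result

def d_separated(dag, x, y, z):
    """Check whether z d-separates x and y, using the moralized ancestral graph."""
    zset = set(z)
    keep = ancestors(dag, {x, y} | zset)
    # Moralize: link each kept node to its parents and marry co-parents.
    nbrs = {v: set() for v in keep}
    for v in keep:
        pa = parents(dag, v) & keep
        for p in pa:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for p, q in combinations(pa, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # x and y are d-separated iff y is unreachable from x once z is removed.
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for w in nbrs[v] - zset:
            if w == y:
                return False
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return True

def find_separator(dag, x, y):
    """Search conditioning sets only among subsets of MB(x), smallest first."""
    candidates = sorted(markov_blanket(dag, x) - {y})
    for k in range(len(candidates) + 1):
        for z in combinations(candidates, k):
            if d_separated(dag, x, y, z):
                return set(z)
    return None  # no separating subset found, e.g. because x and y are adjacent

# Toy graph (an assumption for illustration): A -> C <- B, C -> D
dag = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
sep_ab = find_separator(dag, "A", "B")  # A and B are marginally d-separated
sep_ad = find_separator(dag, "A", "D")  # conditioning on C blocks A -> C -> D
sep_ac = find_separator(dag, "A", "C")  # adjacent vertices have no separator
```

Because the candidate conditioning sets are drawn only from the Markov blanket of `x`, the largest set ever tested is bounded by the blanket size rather than by the total number of vertices, which is the property the abstract relies on to keep the order of the tests independent of problem scale. In a data-driven setting, `d_separated` would be replaced by a conditional independence test.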