On the selection of predictors by using greedy algorithms and information theoretic criteria
Fangyao Li, Christopher M. Triggs, Ciprian Doru Giurcăneanu
DOI: 10.1111/anzs.12387
Published: 2023-06-29 (Journal Article)
URL: https://onlinelibrary.wiley.com/doi/10.1111/anzs.12387
Citation count: 1
Abstract
We discuss the use of the following greedy algorithms in the prediction of multivariate time series: Matching Pursuit Algorithm (MPA), Orthogonal Matching Pursuit (OMP), Relaxed Matching Pursuit (RMP), Frank–Wolfe Algorithm (FWA) and Constrained Matching Pursuit (CMP). The last two are known to be solvers for the lasso problem. Some of the algorithms are well known (e.g. OMP), while others are less popular (e.g. RMP). We provide a unified presentation of all the algorithms, and evaluate their computational complexity for the high-dimensional case and for the big data case. We show how 12 information theoretic (IT) criteria can be used jointly with the greedy algorithms. As part of this effort, we derive new theoretical results that allow the IT criteria to be modified so that they are compatible with RMP. The prediction capabilities are tested in experiments with two data sets. The first involves air pollution data measured in Auckland (New Zealand) and the second concerns the House Price Index in England (the United Kingdom).
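To illustrate what using an IT criterion jointly with a greedy algorithm can look like, here is a minimal Python sketch. It is not the authors' implementation. It assumes a plain textbook OMP on a single response vector, and it uses the Gaussian linear-model form of BIC only as a familiar stand-in, since the abstract does not list the 12 criteria. The function names `omp_path`, `bic` and `select_by_bic` are hypothetical.

```python
import numpy as np


def omp_path(X, y, max_terms):
    """Textbook OMP sketch: return the selected column indices, in order,
    and the residual sum of squares (RSS) after each greedy step."""
    col_norms = np.linalg.norm(X, axis=0)
    residual = y.astype(float).copy()
    selected, rss_path = [], []
    for _ in range(max_terms):
        # Greedy step: predictor most correlated with the current residual.
        score = np.abs(X.T @ residual) / col_norms
        if selected:
            score[selected] = -np.inf  # never pick the same column twice
        selected.append(int(np.argmax(score)))
        # "Orthogonal" step: refit least squares on all selected columns.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
        rss_path.append(float(residual @ residual))
    return selected, rss_path


def bic(rss, n, k):
    """Gaussian-model BIC for k fitted coefficients (assumed stand-in criterion)."""
    return n * np.log(rss / n) + k * np.log(n)


def select_by_bic(X, y, max_terms):
    """Run OMP for max_terms steps and keep the model size with the lowest BIC."""
    n = X.shape[0]
    selected, rss_path = omp_path(X, y, max_terms)
    scores = [bic(rss, n, k) for k, rss in enumerate(rss_path, start=1)]
    best_k = int(np.argmin(scores)) + 1
    return selected[:best_k], scores


# Usage on synthetic data (illustrative only, not the paper's data sets).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.5 * rng.standard_normal(200)
chosen, scores = select_by_bic(X, y, max_terms=6)
print("selected predictors:", chosen)
```

The greedy algorithm produces a nested sequence of candidate models, and the criterion only decides where to stop along that sequence. Swapping in another criterion would mean replacing `bic` alone. Because the candidates are chosen greedily, a criterion such as BIC may keep a few more predictors than the true model contains.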