Forecasting using predictor selection from a large set of highly correlated variables
Authors: A. Timofeeva, Y. Mezentsev
Published in: Collection of selected papers of the III International Conference on Information Technology and Nanotechnology (2019)
DOI: https://doi.org/10.18287/1613-0073-2019-2416-10-18
This paper explores the potential of correlation-based feature selection for choosing an optimal subset from a set of highly correlated predictors. The problem arises, for example, in time series forecasting of economic indicators, where regression models are built on multiple lags of a large number of candidate leading indicators. In such cases greedy algorithms (forward selection and backward elimination) fail. To obtain the globally optimal solution, the feature selection problem is formulated as a mixed integer programming problem and solved with the binary cut-and-branch method. Simulation studies demonstrate the advantage of the binary cut-and-branch method over heuristic search algorithms. A real-data example, the selection of leading indicators of consumer price index growth, shows that the correlation-based feature selection method is acceptable for this task.
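To illustrate why a greedy search can miss the best subset, the sketch below compares forward selection with an exhaustive search on a tiny example. It assumes the commonly used correlation-based feature selection merit, (sum of predictor-target correlations) / sqrt(k + 2 * sum of pairwise predictor-predictor correlations) for a subset of size k; the abstract does not spell out the exact criterion, so this is an assumption. The exhaustive search is only a brute-force stand-in for an exact solver and is not the binary cut-and-branch method. The correlation values and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cfs_merit(subset, r_cf, r_ff):
    """Assumed CFS merit: relevance / sqrt(k + 2 * pairwise redundancy)."""
    k = len(subset)
    if k == 0:
        return 0.0
    relevance = sum(abs(r_cf[i]) for i in subset)
    redundancy = sum(abs(r_ff[i][j]) for i, j in combinations(subset, 2))
    return relevance / sqrt(k + 2.0 * redundancy)


def forward_selection(r_cf, r_ff):
    """Greedy search: add the best predictor until the merit stops improving."""
    selected, best = [], 0.0
    remaining = set(range(len(r_cf)))
    while remaining:
        cand = max(sorted(remaining),
                   key=lambda j: cfs_merit(selected + [j], r_cf, r_ff))
        merit = cfs_merit(selected + [cand], r_cf, r_ff)
        if merit <= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = merit
    return sorted(selected), best


def exhaustive_search(r_cf, r_ff):
    """Brute-force optimum over all non-empty subsets (small n only)."""
    n = len(r_cf)
    subsets = (s for k in range(1, n + 1) for s in combinations(range(n), k))
    best = max(subsets, key=lambda s: cfs_merit(s, r_cf, r_ff))
    return list(best), cfs_merit(best, r_cf, r_ff)


# Hypothetical correlations: predictor 0 is the most relevant on its own
# but is strongly correlated with predictors 1 and 2, which are
# uncorrelated with each other.
r_cf = [0.6, 0.5, 0.5]
r_ff = [[1.0, 0.6, 0.6],
        [0.6, 1.0, 0.0],
        [0.6, 0.0, 1.0]]

greedy_subset, greedy_merit = forward_selection(r_cf, r_ff)
exact_subset, exact_merit = exhaustive_search(r_cf, r_ff)
print("forward selection:", greedy_subset, round(greedy_merit, 4))
print("exhaustive search:", exact_subset, round(exact_merit, 4))
```

With these numbers, forward selection starts from predictor 0 and can never drop it, so it ends with all three predictors (merit about 0.689), while the exhaustive search finds the pair {1, 2} (merit about 0.707). Enumeration grows as 2^n, which is why an exact integer programming formulation becomes attractive once many lags of many indicators are candidates.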