{"title":"局部最小值与K-fold交叉验证的可变性","authors":"C. Marzban, Jueyi Liu, P. Tissot","doi":"10.1175/aies-d-21-0004.1","DOIUrl":null,"url":null,"abstract":"\nResampling methods such as cross-validation or bootstrap are often employed to estimate the uncertainty in a loss function due to sampling variability, usually for the purpose of model selection. But in models that require nonlinear optimization, the existence of local minima in the loss function landscape introduces an additional source of variability which is confounded with sampling variability. In other words, some portion of the variability in the loss function across different resamples is due to local minima. Given that statistically-sound model selection is based on an examination of variance, it is important to disentangle these two sources of variability. To that end, a methodology is developed for estimating each, specifically in the context of K-fold cross-validation, and Neural Networks (NN) whose training leads to different local minima. Random effects models are used to estimate the two variance components - due to sampling and due to local minima. The results are examined as a function of the number of hidden nodes, and the variance of the initial weights, with the latter controlling the “depth” of local minima. The main goal of the methodology is to increase statistical power in model selection and/or model comparison. Using both simulated and realistic data it is shown that the two sources of variability can be comparable, casting doubt on model selection methods that ignore the variability due to local minima. 
Furthermore, the methodology is sufficiently flexible so as to allow assessment of the effect of other/any NN parameters on variability.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"83 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"On Variability due to Local Minima and K-fold Cross-validation\",\"authors\":\"C. Marzban, Jueyi Liu, P. Tissot\",\"doi\":\"10.1175/aies-d-21-0004.1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\nResampling methods such as cross-validation or bootstrap are often employed to estimate the uncertainty in a loss function due to sampling variability, usually for the purpose of model selection. But in models that require nonlinear optimization, the existence of local minima in the loss function landscape introduces an additional source of variability which is confounded with sampling variability. In other words, some portion of the variability in the loss function across different resamples is due to local minima. Given that statistically-sound model selection is based on an examination of variance, it is important to disentangle these two sources of variability. To that end, a methodology is developed for estimating each, specifically in the context of K-fold cross-validation, and Neural Networks (NN) whose training leads to different local minima. Random effects models are used to estimate the two variance components - due to sampling and due to local minima. The results are examined as a function of the number of hidden nodes, and the variance of the initial weights, with the latter controlling the “depth” of local minima. The main goal of the methodology is to increase statistical power in model selection and/or model comparison. 
Using both simulated and realistic data it is shown that the two sources of variability can be comparable, casting doubt on model selection methods that ignore the variability due to local minima. Furthermore, the methodology is sufficiently flexible so as to allow assessment of the effect of other/any NN parameters on variability.\",\"PeriodicalId\":94369,\"journal\":{\"name\":\"Artificial intelligence for the earth systems\",\"volume\":\"83 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial intelligence for the earth systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1175/aies-d-21-0004.1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial intelligence for the earth systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1175/aies-d-21-0004.1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
On Variability due to Local Minima and K-fold Cross-validation
Resampling methods such as cross-validation or the bootstrap are often employed to estimate the uncertainty in a loss function due to sampling variability, usually for the purpose of model selection. But in models that require nonlinear optimization, the existence of local minima in the loss function landscape introduces an additional source of variability that is confounded with sampling variability. In other words, some portion of the variability in the loss function across different resamples is due to local minima. Given that statistically sound model selection is based on an examination of variance, it is important to disentangle these two sources of variability. To that end, a methodology is developed for estimating each, specifically in the context of K-fold cross-validation and neural networks (NNs) whose training leads to different local minima. Random effects models are used to estimate the two variance components, one due to sampling and one due to local minima. The results are examined as a function of the number of hidden nodes and of the variance of the initial weights, with the latter controlling the “depth” of local minima. The main goal of the methodology is to increase statistical power in model selection and/or model comparison. Using both simulated and realistic data, it is shown that the two sources of variability can be comparable, casting doubt on model selection methods that ignore the variability due to local minima. Furthermore, the methodology is flexible enough to allow assessment of the effect of any other NN parameter on variability.
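The idea of separating the two variance components can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that are not in the abstract: a balanced design in which each of K folds is trained from R random initializations, a one-way random effects model loss[k][r] = mu + fold_effect[k] + restart_effect[k][r] with independent effects, and classical method-of-moments (ANOVA) estimators. The losses are synthetic draws rather than the output of real NN training, and the function names `simulate_losses` and `variance_components` are hypothetical.

```python
import random
from statistics import fmean


def simulate_losses(n_folds, n_restarts, mu, sd_sampling, sd_minima, seed=0):
    """Synthetic validation losses: loss[k][r] = mu + fold effect + restart effect.

    Assumption: both effects are independent Gaussian draws. Real K-fold
    losses are correlated across folds, which this sketch ignores.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_folds):
        # One draw per fold, shared by every restart trained on that fold.
        fold_effect = rng.gauss(0.0, sd_sampling)
        losses.append([mu + fold_effect + rng.gauss(0.0, sd_minima)
                       for _ in range(n_restarts)])
    return losses


def variance_components(losses):
    """ANOVA (method-of-moments) estimates for a balanced one-way layout.

    Returns (var_sampling, var_minima): the between-fold variance and the
    within-fold (between-restart) variance.
    """
    n_folds = len(losses)
    n_restarts = len(losses[0])
    if n_folds < 2 or n_restarts < 2:
        raise ValueError("need at least 2 folds and 2 restarts per fold")
    if any(len(row) != n_restarts for row in losses):
        raise ValueError("design must be balanced")

    fold_means = [fmean(row) for row in losses]
    grand_mean = fmean(fold_means)

    # Mean square between folds and mean square within folds.
    ms_between = (n_restarts
                  * sum((m - grand_mean) ** 2 for m in fold_means)
                  / (n_folds - 1))
    ms_within = (sum((y - m) ** 2
                     for row, m in zip(losses, fold_means) for y in row)
                 / (n_folds * (n_restarts - 1)))

    var_minima = ms_within
    # E[ms_between] = var_minima + n_restarts * var_sampling; clip at zero.
    var_sampling = max((ms_between - ms_within) / n_restarts, 0.0)
    return var_sampling, var_minima


if __name__ == "__main__":
    # 10 folds, 20 random initializations per fold (illustrative values).
    data = simulate_losses(10, 20, mu=1.0, sd_sampling=0.2, sd_minima=0.1)
    var_s, var_m = variance_components(data)
    print(f"sampling variance   ~ {var_s:.4f}")
    print(f"local-minima variance ~ {var_m:.4f}")
```

With only K folds, the between-fold estimate is based on K - 1 degrees of freedom and is therefore noisy; adding restarts sharpens the local-minima component but not the sampling component. A mixed-model fit (for example, restricted maximum likelihood) would be a natural alternative to the moment estimators used here.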