On Variability due to Local Minima and K-fold Cross-validation

C. Marzban, Jueyi Liu, P. Tissot
Journal: Artificial Intelligence for the Earth Systems
DOI: 10.1175/aies-d-21-0004.1
Published: 2022-10-06 (Journal Article)
Citations: 1

Abstract

Resampling methods such as cross-validation or the bootstrap are often employed to estimate the uncertainty in a loss function due to sampling variability, usually for the purpose of model selection. But in models that require nonlinear optimization, the existence of local minima in the loss-function landscape introduces an additional source of variability, which is confounded with sampling variability. In other words, some portion of the variability in the loss function across different resamples is due to local minima. Given that statistically sound model selection is based on an examination of variance, it is important to disentangle these two sources of variability. To that end, a methodology is developed for estimating each, specifically in the context of K-fold cross-validation and neural networks (NNs) whose training leads to different local minima. Random effects models are used to estimate the two variance components: one due to sampling and one due to local minima. The results are examined as a function of the number of hidden nodes and the variance of the initial weights, with the latter controlling the "depth" of local minima. The main goal of the methodology is to increase statistical power in model selection and/or model comparison. Using both simulated and realistic data, it is shown that the two sources of variability can be comparable, casting doubt on model selection methods that ignore the variability due to local minima. Furthermore, the methodology is sufficiently flexible to allow assessment of the effect of any other NN parameters on variability.
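The decomposition described above can be sketched with a one-way random-effects (ANOVA-style) estimator: losses from K folds, each with R random restarts of NN training, give a between-fold variance (sampling) and a within-fold variance (local minima). The code below is a minimal method-of-moments illustration, not the authors' implementation; the array shapes and variance values are invented for the example.

```python
import numpy as np

def variance_components(loss):
    """One-way random-effects (method-of-moments) decomposition.

    loss: (K, R) array of validation losses from K cross-validation
    folds, each with R random restarts of NN training. Variation
    between fold means reflects sampling variability; variation
    across restarts within a fold reflects local minima.
    """
    K, R = loss.shape
    fold_means = loss.mean(axis=1)
    grand_mean = loss.mean()
    # Mean square within folds: spread across restarts (local minima)
    ms_within = ((loss - fold_means[:, None]) ** 2).sum() / (K * (R - 1))
    # Mean square between fold means
    ms_between = R * ((fold_means - grand_mean) ** 2).sum() / (K - 1)
    var_local = ms_within
    # Clamp at zero, since the moment estimator can go negative
    var_sampling = max((ms_between - ms_within) / R, 0.0)
    return var_sampling, var_local

# Synthetic check: known variance components are roughly recovered.
rng = np.random.default_rng(0)
K, R = 10, 20
fold_effect = rng.normal(0.0, 0.05, size=(K, 1))     # sd 0.05 -> var 0.0025
restart_effect = rng.normal(0.0, 0.10, size=(K, R))  # sd 0.10 -> var 0.01
loss = 1.0 + fold_effect + restart_effect
var_sampling, var_local = variance_components(loss)
```

In practice the loss matrix would come from retraining the same NN architecture R times per fold with different initial weights; the paper's random effects models generalize this simple balanced-design estimator.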