Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva
{"title":"Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset","authors":"Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva","doi":"10.1175/aies-d-23-0013.1","DOIUrl":null,"url":null,"abstract":"Abstract Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic—primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models post-retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data—a few thousand samples—in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speed-up. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. 
The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"79 3-4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial intelligence for the earth systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1175/aies-d-23-0013.1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic, relying primarily on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models after retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speed-up. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
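The two-step procedure the abstract describes (screen all candidate configurations cheaply on a small training subset, then retrain only the top performers on the full dataset and re-rank them) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy ridge-style model, the `fit_slope`/`neg_mse` helpers, and the fraction and top-k values below are all assumptions for demonstration.

```python
def two_step_hpo(configs, train_fn, score_fn, train_data, val_data,
                 subset_frac=0.05, top_k=2):
    """Two-step HPO sketch: step 1 trains every config on a small subset of
    the training data and ranks them by validation score; step 2 retrains
    only the top-k finalists on the full training set and re-ranks them."""
    n_sub = max(1, int(len(train_data) * subset_frac))
    subset = train_data[:n_sub]

    # Step 1: cheap preliminary ranking on the subset.
    ranked = sorted(configs,
                    key=lambda c: score_fn(train_fn(c, subset), val_data),
                    reverse=True)

    # Step 2: expensive retraining of the finalists on the full dataset.
    finalists = [(score_fn(train_fn(c, train_data), val_data), c)
                 for c in ranked[:top_k]]
    best_score, best_cfg = max(finalists, key=lambda t: t[0])
    return best_cfg, best_score


# Toy model for illustration: a 1-D ridge fit y ~ w*x, where the
# hyperparameter is the regularization strength "lam" (hypothetical names).
def fit_slope(cfg, data):
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    return sxy / (sxx + cfg["lam"])


def neg_mse(w, data):
    # Higher is better, so score is the negative mean squared error.
    return -sum((y - w * x) ** 2 for x, y in data) / len(data)
```

In a real setting, `train_fn` would train a neural network and `score_fn` would evaluate it on a held-out set; only the `top_k` survivors of the subset screen ever pay the cost of a full-data training run, which is the source of the speed-up the abstract reports.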