Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Artificial Intelligence for the Earth Systems, Pub Date: 2023-11-01, DOI: 10.1175/aies-d-23-0013.1

Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva


Citations: 0



Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Abstract: Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic, relying primarily on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models after retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speed-up. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
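The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy one-dimensional ridge regression stands in for the neural network emulators, random search stands in for whichever HPO algorithm is used, and every name here (`fit_ridge_1d`, `val_mse`, `two_step_hpo`, `fraction`, `top_k`) is a hypothetical choice made for illustration.

```python
import random

def fit_ridge_1d(train, lam):
    """Toy stand-in for model training: closed-form 1-D ridge regression
    through the origin, w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)

def val_mse(w, val):
    """Mean squared error of the fitted slope w on the validation set."""
    return sum((y - w * x) ** 2 for x, y in val) / len(val)

def two_step_hpo(train, val, candidates, fraction, top_k, rng):
    """Hypothetical two-step search over a list of candidate hyperparameters."""
    # Step 1: train every candidate on a small random subset and rank them.
    n_sub = max(1, int(len(train) * fraction))
    subset = rng.sample(train, n_sub)
    ranked = sorted(candidates,
                    key=lambda lam: val_mse(fit_ridge_1d(subset, lam), val))
    shortlist = ranked[:top_k]
    # Step 2: retrain only the shortlisted candidates on the full training
    # set and re-rank them on the same validation set.
    reranked = sorted(shortlist,
                      key=lambda lam: val_mse(fit_ridge_1d(train, lam), val))
    return shortlist, reranked[0]

def make_data(n, rng):
    """Synthetic data (an assumption for this sketch): y = 2x + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return data

rng = random.Random(0)
train = make_data(20000, rng)
val = make_data(2000, rng)
# Random search: 50 regularization strengths drawn log-uniformly.
candidates = [10.0 ** rng.uniform(-4, 3) for _ in range(50)]
shortlist, best_lam = two_step_hpo(train, val, candidates,
                                   fraction=0.01, top_k=5, rng=rng)
print("shortlist:", shortlist)
print("selected:", best_lam)
```

In this sketch only `top_k` of the 50 candidates are ever trained on the full dataset, which is where the savings come from. The `fraction` and `top_k` values are arbitrary and would need tuning for a real problem.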
