Optimization on selecting XGBoost hyperparameters using meta-learning

Expert Systems · Impact Factor 3.0 · JCR Q2 (Computer Science, Artificial Intelligence) · Region 4 (Computer Science) · Pub Date: 2024-04-25 · DOI: 10.1111/exsy.13611
Tiago Lima Marinho, Diego Carvalho do Nascimento, Bruno Almeida Pimentel
{"title":"Optimization on selecting XGBoost hyperparameters using meta-learning","authors":"Tiago Lima Marinho,&nbsp;Diego Carvalho do Nascimento,&nbsp;Bruno Almeida Pimentel","doi":"10.1111/exsy.13611","DOIUrl":null,"url":null,"abstract":"<p>With computational evolution, there has been a growth in the number of machine learning algorithms and they became more complex and robust. A greater challenge is upon faster and more practical ways to find hyperparameters that will set up each algorithm individually. This article aims to use meta-learning as a practicable solution for recommending hyperparameters from similar datasets, through their meta-features structures, than to adopt the already trained XGBoost parameters for a new database. This reduced computational costs and also aimed to make real-time decision-making feasible or reduce any extra costs for companies for new information. The experimental results, adopting 198 data sets, attested to the success of the heuristics application using meta-learning to compare datasets structure analysis. Initially, a characterization of the datasets was performed by combining three groups of meta-features (general, statistical, and info-theory), so that there would be a way to compare the similarity between sets and, thus, apply meta-learning to recommend the hyperparameters. Later, the appropriate number of sets to characterize the XGBoost turning was tested. The obtained results were promising, showing an improved performance in the accuracy of the XGBoost, <i>k</i> = {4 − 6}, using the average of the hyperparameters values and, comparing to the standard grid-search hyperparameters set by default, it was obtained that, in 78.28% of the datasets, the meta-learning methodology performed better. This study, therefore, shows that the adoption of meta-learning is a competitive alternative to generalize the XGBoost model, expecting better statistics performance (accuracy etc.) rather than adjusting to a single/particular model.</p>","PeriodicalId":51053,"journal":{"name":"Expert Systems","volume":null,"pages":null},"PeriodicalIF":3.0000,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/exsy.13611","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

With the evolution of computing, the number of machine learning algorithms has grown, and they have become more complex and robust. An even greater challenge lies in finding faster and more practical ways to select the hyperparameters that configure each algorithm individually. This article proposes meta-learning as a practicable solution: hyperparameters are recommended for a new database by adopting the XGBoost parameters already tuned on similar datasets, identified through their meta-feature structures. This reduces computational cost, aims to make real-time decision-making feasible, and spares companies extra tuning costs when new data arrive. Experimental results on 198 datasets attest to the success of this heuristic, which uses meta-learning to compare the structure of datasets. First, the datasets were characterized by combining three groups of meta-features (general, statistical, and information-theoretic), providing a way to compare the similarity between sets and thus to apply meta-learning to recommend hyperparameters. Next, the appropriate number of similar datasets for characterizing the XGBoost tuning was tested. The results were promising: using the average of the hyperparameter values of the k = {4-6} most similar datasets improved the accuracy of XGBoost, and compared with the standard grid-search hyperparameters set by default, the meta-learning methodology performed better on 78.28% of the datasets. This study therefore shows that adopting meta-learning is a competitive alternative for generalizing the XGBoost model, offering better statistical performance (accuracy, etc.) than adjusting a single, particular model.
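The sketch below is a minimal illustration of the recommendation pipeline summarized in the abstract, not the authors' exact implementation. It assumes the pymfe library for extracting general, statistical, and info-theory meta-features, and it assumes a repository of previously tuned datasets represented by two objects (reference_meta_features and reference_hyperparams); the hyperparameter names in the usage comment are hypothetical placeholders.

# Sketch of the meta-learning recommendation step (illustrative assumptions:
# pymfe for meta-feature extraction, and a precomputed repository of tuned
# reference datasets given as a meta-feature matrix plus a list of
# hyperparameter dictionaries).
import numpy as np
from pymfe.mfe import MFE
from xgboost import XGBClassifier

def extract_meta_features(X, y):
    """Characterize a dataset with general, statistical, and info-theory meta-features."""
    mfe = MFE(groups=["general", "statistical", "info-theory"])
    mfe.fit(X, y)
    _, values = mfe.extract()
    # Replace NaNs produced for degenerate meta-features so distances stay finite.
    return np.nan_to_num(np.array(values, dtype=float))

def recommend_hyperparameters(X_new, y_new,
                              reference_meta_features,  # shape: (n_datasets, n_meta_features)
                              reference_hyperparams,    # list of dicts, one per tuned dataset
                              k=5):
    """Average the tuned hyperparameters of the k most similar reference datasets."""
    query = extract_meta_features(X_new, y_new)

    # Standardize meta-features so no single feature dominates the distance.
    mu = reference_meta_features.mean(axis=0)
    sigma = reference_meta_features.std(axis=0) + 1e-12
    ref_scaled = (reference_meta_features - mu) / sigma
    query_scaled = (query - mu) / sigma

    # k nearest reference datasets by Euclidean distance in meta-feature space.
    distances = np.linalg.norm(ref_scaled - query_scaled, axis=1)
    nearest = np.argsort(distances)[:k]

    # Recommended setting = mean of each hyperparameter over the k neighbours.
    keys = reference_hyperparams[0].keys()
    return {key: float(np.mean([reference_hyperparams[i][key] for i in nearest]))
            for key in keys}

# Usage sketch (hyperparameter names are placeholders, not taken from the paper):
# params = recommend_hyperparameters(X_new, y_new, ref_mf, ref_hp, k=5)
# model = XGBClassifier(
#     n_estimators=int(round(params["n_estimators"])),
#     max_depth=int(round(params["max_depth"])),
#     learning_rate=params["learning_rate"],
#     subsample=params["subsample"],
# ).fit(X_new, y_new)

Under these assumptions, the recommended configuration can then be compared against XGBoost's default or grid-searched settings, which is the comparison the abstract reports for k = 4 to 6.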

Source journal
Expert Systems (Engineering & Technology, Computer Science: Theory & Methods)
CiteScore: 7.40
Self-citation rate: 6.10%
Articles published: 266
Review time: 24 months
About the journal: Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper. As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.
Latest articles in this journal
A comprehensive survey on deep learning-based intrusion detection systems in Internet of Things (IoT)
MTFDN: An image copy-move forgery detection method based on multi-task learning
STP-CNN: Selection of transfer parameters in convolutional neural networks
Label distribution learning for compound facial expression recognition in-the-wild: A comparative study
Federated learning-driven dual blockchain for data sharing and reputation management in Internet of medical things