Feature Sorting Algorithm Based on XGBoost and MIC Combination Model

Gao Xiang, Yu Jun, Huo Zhiyi, Huang Yuzhe
Journal: International Journal of Advanced Network, Monitoring and Controls
Publication type: Journal Article
Published: 2021-01-01
DOI: 10.21307/ijanmc-2021-037 (https://doi.org/10.21307/ijanmc-2021-037)

Abstract

Feature ranking helps a data analysis system run more efficiently and reduces the interference of redundant and irrelevant features with the results. Ranking the features of massive data sets remains an important and difficult problem. To address it, this paper analyzes existing algorithm models and proposes a feature importance ranking algorithm based on a combination of XGBoost and MIC (maximal information coefficient). First, an XGBoost model and a MIC model are built separately. Then the results of the two models are weighted and combined by the error reciprocal method. XGBoost is efficient, flexible, and portable, while MIC is general and easy to tune; the resulting XGBoost-MIC combination model has the advantages of both. Finally, a data set of anticancer drug candidates is used as the sample set for a case study. After the data set is preprocessed, the XGBoost-MIC combination model is applied to it; the single models are evaluated on the same data for comparison, and the models are optimized by adjusting their parameters. The results show that the error of the combination model is clearly lower than that of a single model, and that the accuracy of XGBoost-MIC is 0.75, which is 0.02 higher than that of the single model.
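The abstract names the error reciprocal method but does not give its formulas, so the following is only a minimal sketch of how such a weighting is commonly done: each model receives a weight proportional to the reciprocal of its error, and the weighted per-feature scores are summed. The function names, the normalization of each model's scores to sum to 1, and all numbers are assumptions made for illustration; none of them come from the paper.

```python
from typing import Dict, List, Tuple


def reciprocal_error_weights(errors: List[float]) -> List[float]:
    """Weight each model by 1/error, normalized so the weights sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative scores to sum to 1 (an assumed preprocessing step)."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {name: value / total for name, value in scores.items()}


def combine_rankings(
    xgb_scores: Dict[str, float],
    mic_scores: Dict[str, float],
    xgb_error: float,
    mic_error: float,
) -> List[Tuple[str, float]]:
    """Combine two per-feature score tables and sort features by combined score."""
    w_xgb, w_mic = reciprocal_error_weights([xgb_error, mic_error])
    xgb_norm = normalize(xgb_scores)
    mic_norm = normalize(mic_scores)
    combined = {
        name: w_xgb * xgb_norm[name] + w_mic * mic_norm[name]
        for name in xgb_norm
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores and errors, made up for this example (not from the paper).
xgb_scores = {"f1": 0.5, "f2": 0.3, "f3": 0.2}
mic_scores = {"f1": 0.2, "f2": 0.6, "f3": 0.2}
weights = reciprocal_error_weights([0.25, 0.75])
ranking = combine_rankings(xgb_scores, mic_scores, xgb_error=0.25, mic_error=0.75)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the two score tables would come from a trained XGBoost model's feature importances and from MIC values computed between each feature and the target. The model with the smaller error dominates the combined ranking, which is the design intent of reciprocal weighting.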