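Performance Analysis of XGBoost Algorithm to Determine the Most Optimal Parameters and Features in Predicting Stock Price Movement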

Affan Ardana
Journal: Telematika
Publication date: 2023-03-01
DOI: 10.31315/telematika.v20i1.9329 (https://doi.org/10.31315/telematika.v20i1.9329)
Citations: 0

Abstract

Purpose: The research aims to find the best parameters and features for predicting stock price movement with the XGBoost algorithm. Parameters are selected by RMSE value, and features are selected by importance value.

Design/methodology/approach: The research data is the stock data of Amazon.com (AMZN). The dataset contains the Date, Low, Open, Volume, High, Close, and Adjusted Close features. Missing values are handled so that the dataset contains no missing data. Input features are selected with the Pearson correlation feature selection method. To keep the gap between the highest and lowest stock prices from becoming too large, the data is scaled. To avoid bias in the prediction results, cross-validation is used together with Min-Max scaling; it divides the dataset into training data and testing data, where the testing data covers the 30 days following the training data. The parameters tested are n_estimator = 500, early stopping round = 3, learning rate = 0.01, 0.05, 0.1, and max_depth (tree depth) = 3, 4, 5.

Findings/result: A learning rate of 0.05 with a tree depth of 5 produced the lowest RMSE of all the models, 0.009437. The Low feature obtained the highest importance value in every model built.

Originality/value/state of the art: This study used testing data within a range of 30 days after the training data and a combination of parameters comprising n_estimator = 500, early stopping round = 3, learning rate = 0.01, 0.05, 0.1, and max_depth (tree depth) = 3, 4, 5.
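The evaluation setup described in the abstract can be sketched in plain Python. This is a minimal illustration, not the author's code, and it rests on several assumptions: the price series is synthetic, the helper names (`pearson`, `split_holdout`, `fit_min_max`, `rmse`) are hypothetical, the scaler is fitted on the training window only, and the dictionary keys use the spelling of the xgboost scikit-learn wrapper rather than the abstract's wording. A previous-day baseline stands in for the XGBoost model so that the block runs without external libraries.

```python
import itertools
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_holdout(series, test_days=30):
    """Chronological split: the last `test_days` points become the test set."""
    return series[:-test_days], series[-test_days:]


def fit_min_max(train):
    """Build a Min-Max scaler from the training data only (an assumption here).

    Raises ZeroDivisionError later if the training data is constant.
    """
    lo, hi = min(train), max(train)
    return lambda values: [(v - lo) / (hi - lo) for v in values]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Parameter grid from the abstract: 3 learning rates x 3 depths = 9 models.
LEARNING_RATES = [0.01, 0.05, 0.1]
MAX_DEPTHS = [3, 4, 5]
param_grid = [
    {
        "n_estimators": 500,
        "early_stopping_rounds": 3,
        "learning_rate": lr,
        "max_depth": depth,
    }
    for lr, depth in itertools.product(LEARNING_RATES, MAX_DEPTHS)
]

# Synthetic stand-in for AMZN prices (not data from the paper).
close = [100.0 + 0.5 * i + math.sin(i) for i in range(120)]
low = [c - 1.0 for c in close]

print("corr(Low, Close):", round(pearson(low, close), 4))

train, test = split_holdout(close, test_days=30)
scale = fit_min_max(train)
train_s, test_s = scale(train), scale(test)

# Placeholder model: predict each test day with the previous day's value.
baseline = [train_s[-1]] + test_s[:-1]
print("baseline RMSE on scaled data:", rmse(test_s, baseline))
print("configurations to evaluate:", len(param_grid))
```

In a fuller experiment, each dictionary in `param_grid` would configure one gradient-boosted model, and the configuration with the lowest test RMSE would be kept, as the abstract reports for a learning rate of 0.05 and a depth of 5.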