Ensemble learning with imbalanced data handling in the early detection of capital markets

Putri Auliana Rifqi Mukhlashin, Anwar Fitrianto, Agus M Soleh, Wan Zuki Azman Wan Muhamad
Journal of Accounting and Investment. Published 2023-05-12. DOI: 10.18196/jai.v24i2.17970 (https://doi.org/10.18196/jai.v24i2.17970)
Citations: 0

Abstract

Research aims: This study aims to build an early detection model that predicts events in the Indonesian capital market.

Design/Methodology/Approach: This quantitative study compared ensemble learning models combined with imbalanced data handling for the early detection of capital market events. Five ensemble learning models (Random Forest, ExtraTrees, CatBoost, XGBoost, and LightGBM) were paired with imbalance handling methods: undersampling (RUS), oversampling (SMOTE, SMOTE-Border, ADASYN), combined over-under sampling (SMOTE-Tomek, SMOTE-ENN), and weighting (class weight). The predictors were global and regional stock markets, commodities, exchange rates, technical indicators, sectoral indices, JCI leaders, MSCI, net buys of foreign stocks, national securities, and national share ownership. The response was binary and based on the lowest return under the Crisis Management Protocol (CMP).

Research findings: Hyperparameters and thresholds were tuned to obtain the optimum model, and the best model was the one with the highest G-Mean. ExtraTrees with SMOTE-ENN was best for one-day events, with a G-Mean of 96.88%. LightGBM with SMOTE was best for five-day events, with a G-Mean of 89.21%. CatBoost with SMOTE-Border was best for 15-day events, with a G-Mean of 89.49%. LightGBM with SMOTE-Tomek was best for 30-day events, with a G-Mean of 68.02%. Further, performance evaluation scores decreased as the prediction time increased.

Theoretical contribution/Originality: This work applies a broader set of imbalance handling methods, together with ensemble learning, to capital market early detection.

Practitioner/Policy implication: Capital markets can indicate economic stability. Maintaining capital market efficacy and economic value requires a system to detect pressure.

Research limitation/Implication: This study used ensemble learning models to predict capital market events 1, 5, 15, and 30 days ahead, assuming Indonesian working days. The model's forecast results are expected to be used to monitor the capital market and take precautions.
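
The abstract ranks models by G-Mean after tuning the classification threshold, but it does not spell out either step. The sketch below assumes the standard binary definition, G-Mean = sqrt(sensitivity × specificity), and a simple grid search over thresholds. The function names `g_mean` and `best_threshold`, the threshold grid, and the toy data are all illustrative assumptions, not taken from the study.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def best_threshold(y_true, scores, candidates=None):
    """Grid-search the cut-off that maximizes G-Mean.

    Ties are resolved in favor of the lowest candidate threshold.
    """
    if candidates is None:
        # Hypothetical grid: 0.01, 0.02, ..., 0.99
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_g = None, -1.0
    for t in candidates:
        y_pred = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, y_pred)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


# Toy data, made up for illustration: 2 events among 10 days.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.45, 0.40, 0.80]

# G-Mean at the conventional 0.5 cut-off versus the tuned cut-off.
default_pred = [1 if s >= 0.5 else 0 for s in scores]
default_g = g_mean(y_true, default_pred)
threshold, tuned_g = best_threshold(y_true, scores)

print(f"G-Mean at 0.5: {default_g:.4f}")
print(f"Best threshold: {threshold:.2f}, G-Mean: {tuned_g:.4f}")
```

On this toy data the default 0.5 cut-off misses one of the two events and gives a G-Mean of about 0.7071, while the tuned cut-off of 0.36 catches both at the cost of one false alarm and gives about 0.9354. Because G-Mean multiplies the two class-wise rates, a model that ignores the rare class scores zero however high its accuracy, which is presumably why the study prefers it on imbalanced data.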