Authors: Tran Kim Toai, V. Hanh, Vo Minh Huan
Published in: 2023 International Conference on System Science and Engineering (ICSSE), 2023-07-27
DOI: 10.1109/ICSSE58758.2023.10227248 (https://doi.org/10.1109/ICSSE58758.2023.10227248)
Hybrid Random Forest and Long Short-Term Memory to Mitigate Overfitting Issue in Time Series Stock Data
This paper proposes a hybrid random forest and long short-term memory (LSTM) model to mitigate overfitting on stock market time series data. Many techniques reduce overfitting, such as data augmentation, regularization, feature selection, and dimensionality reduction. We propose a model based on feature selection to reduce model complexity. First, a random forest model selects the stock data features. The selected features are then fed to the LSTM to predict the stock price. In this way, the proposed model can improve accuracy on both the training and test datasets and generalize well to unseen data, which mitigates overfitting. The hybrid random forest and LSTM model is compared with a hybrid ridge and LSTM model and with a single LSTM model in its ability to mitigate overfitting. MAE, RMSE, and R² are used as performance evaluation metrics. We also conduct the study on various stock datasets to evaluate how well the model overcomes overfitting.
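The abstract names MAE, RMSE, and R² as evaluation metrics but does not spell them out. The sketch below uses their standard textbook definitions in plain Python. The `generalization_gap` helper and the sample prices are hypothetical, added only to illustrate one common way of reading overfitting from a train/test difference; they are not taken from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def generalization_gap(train, test):
    """Hypothetical helper (not from the paper).

    train and test are (y_true, y_pred) pairs. Each gap is oriented so that
    a larger positive value means the test set is worse than the training set.
    """
    return {
        "mae_gap": mae(*test) - mae(*train),
        "rmse_gap": rmse(*test) - rmse(*train),
        "r2_gap": r2(*train) - r2(*test),
    }


if __name__ == "__main__":
    # Made-up closing prices and predictions, for illustration only.
    train = ([10.0, 11.0, 12.0, 13.0], [10.1, 10.9, 12.2, 12.8])
    test = ([14.0, 15.0, 16.0], [13.0, 15.5, 17.5])
    for name, gap in generalization_gap(train, test).items():
        print(f"{name}: {gap:.4f}")
```

Under this reading, a model that overfits less would show smaller gaps on the same data, which is the kind of comparison the abstract describes between the hybrid models and the single LSTM.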