Impacts of meteorological variables and machine learning algorithms on rice yield prediction in Korea
Subin Ha, Yong-Tak Kim, Eun-Soon Im, Jina Hur, Sera Jo, Yong-Seok Kim, Kyo-Moon Shim
International Journal of Biometeorology, 67(11), pages 1825-1838. Published 2023-09-05. DOI: 10.1007/s00484-023-02544-x
https://link.springer.com/article/10.1007/s00484-023-02544-x
Abstract
Because crop productivity is strongly influenced by weather conditions, many studies have attempted to estimate crop yields from meteorological data, and the development of machine learning has brought great progress. However, most yield prediction models are built on observational data, and very few studies have addressed the use of climate model output in yield prediction. In this study, we estimate rice yields in South Korea using the meteorological variables provided by ERA5 reanalysis data (ERA-O) and its dynamically downscaled data (ERA-DS). After ERA-O and ERA-DS are validated against observations (OBS), two machine learning models, Support Vector Machine (SVM) and Long Short-Term Memory (LSTM), are trained with different combinations of eight meteorological variables (mean temperature, maximum temperature, minimum temperature, precipitation, diurnal temperature range, solar irradiance, mean wind speed, and relative humidity) obtained from OBS, ERA-O, and ERA-DS at weekly and monthly timescales from May to September. Regardless of the model type and the source of the input data, training a model with weekly datasets leads to better yield estimates than training with monthly datasets. LSTM generally outperforms SVM, especially when the model is trained with ERA-DS data at a weekly timescale. The best yield estimates are produced by the LSTM model trained with all eight variables at a weekly timescale. Altogether, this study shows the significance of high spatial and temporal resolution of input meteorological data in yield prediction, which can also serve to substantiate the added value of dynamical downscaling.
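As a rough illustration of the weekly and monthly timescales mentioned in the abstract, the following Python sketch averages a daily series over the May to September season into both kinds of bins. The abstract does not describe the authors' preprocessing, so the binning rule (consecutive 7-day blocks starting on 1 May), the function names, and the synthetic temperature values are all assumptions made for illustration. Diurnal temperature range is computed with its standard definition, daily maximum minus daily minimum temperature.

```python
from datetime import date, timedelta
from statistics import mean


def growing_season_days(year):
    """Return every date from 1 May to 30 September of `year`."""
    start, end = date(year, 5, 1), date(year, 9, 30)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def aggregate(days, values, timescale):
    """Average daily `values` into weekly or monthly bins.

    Assumption: weekly bins are consecutive 7-day blocks counted from
    the first day of the season; the last block may be shorter.
    """
    bins = {}
    for day, value in zip(days, values):
        if timescale == "weekly":
            key = (day - days[0]).days // 7
        elif timescale == "monthly":
            key = day.month
        else:
            raise ValueError("timescale must be 'weekly' or 'monthly'")
        bins.setdefault(key, []).append(value)
    return [mean(bins[key]) for key in sorted(bins)]


days = growing_season_days(2020)

# Synthetic daily maximum and minimum temperature in degrees Celsius
# (hypothetical values, not data from the study).
tmax = [25.0 + 0.05 * i for i in range(len(days))]
tmin = [15.0 + 0.03 * i for i in range(len(days))]

# Diurnal temperature range: daily maximum minus daily minimum.
dtr = [hi - lo for hi, lo in zip(tmax, tmin)]

weekly_dtr = aggregate(days, dtr, "weekly")
monthly_dtr = aggregate(days, dtr, "monthly")

print(len(days), len(weekly_dtr), len(monthly_dtr))
```

With this binning, a season of 153 days becomes 22 weekly values versus 5 monthly values per variable, which is one way to see the difference in temporal resolution between the two input sets. Feeding such features into an SVM or LSTM is not shown here.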
About the journal:
The Journal publishes original research papers, review articles and short communications on studies examining the interactions between living organisms and factors of the natural and artificial atmospheric environment.
Living organisms range from single-cell organisms to plants and animals, including humans. The atmospheric environment includes climate and weather, electromagnetic radiation, and chemical and biological pollutants. The journal embraces basic and applied research and practical aspects such as living conditions, agriculture, forestry, and health.
The journal is published for the International Society of Biometeorology, and most membership categories include a subscription to the Journal.