Evaluating parallelized support vector regression and nearest neighbor regression with different input variations for estimating daily global solar radiation of the humid subtropical region in China
Authors: Xiang Yu
Journal: International Journal of Low-carbon Technologies
Publication date: 2023-02-08
DOI: 10.1093/ijlct/ctad005 (https://doi.org/10.1093/ijlct/ctad005)
Impact factor: 2.4; JCR: Q3 (Energy & Fuels)
Citation count: 0
Abstract
Indirect estimation of global solar radiation is a strongly nonlinear problem that calls for machine learning. Developing a machine learning model sequentially, however, can be very time-consuming. Moreover, whether and how exogenous meteorological, geographical, and temporal variables affect regression accuracy is still not well understood. This paper evaluates parallelized support vector regression (SVR) and nearest neighbor regression (NNR) models for estimating daily global solar radiation in the humid subtropical region of China, using existing Python libraries on a multi-core central processing unit (CPU) and a graphics processing unit (GPU). Seven input variations are studied: two are commonly adopted in the literature, four contain meteorological, geographical, and/or temporal features with bounded Pearson correlation coefficients (PCCs), and the remaining one simply includes all the available features. Experimental results demonstrate that: SVR and NNR are equally powerful for nonlinear regression, and the variation comprising features with absolute PCCs no less than 0.3 (i.e. just all the meteorological features) achieves the most accurate estimation; the GPU-parallelized SVR model can accelerate parameter calibration and prediction; compared with the CPU-parallelized and GPU-parallelized SVR models, the GPU-parallelized NNR model is much more efficient and scales better as the number of data samples increases; and the CPU-parallelized NNR model takes considerably less parameter calibration time than the GPU-parallelized NNR model, owing to the different methods adopted for determining distances and the significant time the GPU-parallelized NNR model wastes on repeatedly calculating required information during cross-validation.
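The PCC-bounded input variations described in the abstract can be illustrated with a small sketch. It keeps only the features whose absolute Pearson correlation with the target is no less than a threshold (0.3 here, matching the bound quoted above). The function names, feature names, and toy values are all hypothetical and are not taken from the paper, which does not specify its data layout or which Python libraries it uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (an assumption)
    return cov / (sx * sy)


def select_features(columns, target, threshold=0.3):
    """Return names of features whose |PCC| with the target is >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical toy data: six daily records (values invented for illustration).
radiation = [5.0, 9.0, 14.0, 18.0, 21.0, 12.0]
features = {
    "sunshine_hours": [1.0, 3.0, 6.0, 8.0, 10.0, 5.0],
    "relative_humidity": [90.0, 80.0, 65.0, 55.0, 50.0, 70.0],
    "noise_feature": [1.0, 3.0, 2.0, 2.0, 1.0, 3.0],
}

selected = select_features(features, radiation)
print(selected)  # the two strongly correlated features pass; noise_feature does not
```

In a full pipeline, the selected columns would then be passed to the SVR or NNR model. That step is omitted here because the abstract does not name the libraries used.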
About the journal:
The International Journal of Low-Carbon Technologies is a quarterly publication concerned with the challenge of climate change and its effects on the built environment and sustainability. The Journal publishes original, quality research papers on issues of climate change, sustainable development and the built environment related to architecture, building services engineering, civil engineering, building engineering, urban design and other disciplines. It features in-depth articles, technical notes, review papers, book reviews and special issues devoted to international conferences. The journal encourages submissions related to interdisciplinary research in the built environment. The journal is available in paper and electronic formats. All articles are peer-reviewed by leading experts in the field.