Bayesian regression modeling and inference of energy efficiency data: the effect of collinearity and sensitivity analysis

IF 2.6 · CAS Zone 4 (Engineering Technology) · JCR Q3 (ENERGY & FUELS) · Frontiers in Energy Research · Pub Date: 2024-07-30 · DOI: 10.3389/fenrg.2024.1416126
Laila A. Al-Essa, Endris Assen Ebrahim, Yusuf Ali Mergiaw
Citations: 0

Abstract

Most studies have predicted heating demand with linear regression models but have given too little context on current building features. Model problems such as multicollinearity need to be checked, and features must be chosen according to their significance, to produce accurate load predictions and inferences. In the energy efficiency dataset, numerous building features correlate with each other and with heating load. Standard Ordinary Least Squares (OLS) regression is problematic when the dataset exhibits multicollinearity. Bayesian supervised machine learning is a popular method for parameter estimation and inference when frequentist statistical assumptions fail. Predicting heating load, the energy efficiency output, with Bayesian inference in a multiple regression affected by collinearity requires careful data analysis. Multicollinearity among the features in the building energy efficiency dataset significantly affected the parameter estimates and hypothesis tests. This study demonstrates several shrinkage and informative priors on the likelihood in the Bayesian framework as remedies that reduce the collinearity problem in multiple regression analysis. It fits the standard OLS regression and four distinct Bayesian regression models with several prior distributions, using the Hamiltonian Monte Carlo algorithm in Bayesian Regression Modeling using Stan and the package used to fit linear models. Several model comparison and assessment methods were used to select the best-fitting regression model for the dataset. For the collinear energy efficiency data, the Bayesian regression model with a weakly informative prior fitted best, compared with the standard OLS regression and the other Bayesian regression models with shrinkage priors. The numerical findings on collinearity were checked using the variance inflation factor, the estimates of the regression coefficients and their standard errors, and the sensitivity of priors and likelihoods. The authors suggest that applied research in science, engineering, agriculture, health, and other disciplines check for multicollinearity in regression modeling to obtain better estimation and inference.
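
The two diagnostics named in the abstract, the variance inflation factor and the effect of a prior on coefficient uncertainty, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic (not the energy efficiency dataset), the noise standard deviation is treated as known, and the prior is a conjugate zero-mean Gaussian with a closed-form posterior rather than the Stan-based Hamiltonian Monte Carlo fits used in the study. The function names `variance_inflation_factors` and `gaussian_prior_posterior` are hypothetical helpers.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Design matrix: intercept plus every column except column j.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


def gaussian_prior_posterior(X, y, prior_sd, noise_sd):
    """Closed-form posterior for y ~ N(X b, noise_sd^2 I) with prior b ~ N(0, prior_sd^2 I)."""
    p = X.shape[1]
    precision = X.T @ X / noise_sd**2 + np.eye(p) / prior_sd**2
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ y) / noise_sd**2
    return mean, cov


# Synthetic data (assumption): x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 1.0 * x2 + 0.5 * x3 + rng.normal(scale=1.0, size=n)

vif = variance_inflation_factors(X)

# OLS standard errors with the noise sd assumed known and equal to 1.
ols_se = np.sqrt(np.diag(np.linalg.inv(X.T @ X)))

# Posterior standard deviations under the zero-mean Gaussian prior.
post_mean, post_cov = gaussian_prior_posterior(X, y, prior_sd=1.0, noise_sd=1.0)
bayes_sd = np.sqrt(np.diag(post_cov))

print("VIF:", np.round(vif, 1))
print("OLS standard errors:", np.round(ols_se, 3))
print("Posterior standard deviations:", np.round(bayes_sd, 3))
```

In this sketch the two collinear columns get large VIFs and inflated OLS standard errors, while the prior adds a positive term to the precision matrix and so narrows every coefficient's uncertainty. How much narrowing is appropriate depends on the prior scale, which is what a prior sensitivity analysis examines.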
Source journal
Frontiers in Energy Research (Economics, Econometrics and Finance: Economics and Econometrics)
CiteScore: 3.90
Self-citation rate: 11.80%
Articles published: 1727
Review time: 12 weeks
Journal description: Frontiers in Energy Research makes use of the unique Frontiers platform for open-access publishing and research networking for scientists, which provides an equal opportunity to seek, share and create knowledge. The mission of Frontiers is to place publishing back in the hands of working scientists and to promote an interactive, fair, and efficient review process. Articles are peer-reviewed according to the Frontiers review guidelines, which evaluate manuscripts on objective editorial criteria.