Chiagoziem C. Ukwuoma , Dongsheng Cai , Chibueze D. Ukwuoma , Mmesoma P. Chukwuemeka , Blessing O. Ayeni , Chidera O. Ukwuoma , Odeh Victor Adeyi , Qi Huang
Sequential gated recurrent and self attention explainable deep learning model for predicting hydrogen production: Implications and applicability
Applied Energy, Volume 378, Article 124851. Published 2024-11-09. Impact factor: 10.1. JCR: Q1 (Energy & Fuels).
DOI: 10.1016/j.apenergy.2024.124851
https://www.sciencedirect.com/science/article/pii/S0306261924022347
Citations: 0
Abstract
Hydrogen has enormous potential as a clean and sustainable energy source for meeting the challenges of the current energy environment. Realizing that potential requires accurate prediction of hydrogen production. Because of their capacity to identify intricate patterns in data, machine learning and deep learning models have attracted considerable interest from a variety of industries, including the energy industry. Although these models yield acceptable performance, their prediction results still need improvement. They are also inherently black boxes, which makes their predictions difficult to comprehend and interpret, particularly in important sectors such as hydrogen generation. To address these issues, this study proposes a sequential gated recurrent and self-attention network to enhance hydrogen production prediction. The framework captures both sequential dependencies and contextual information, enabling the model to effectively learn and represent temporal patterns in hydrogen production. The experiments use a biomass gasification dataset and the following evaluation metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), Mean Squared Logarithmic Error (MSLE), and Root Mean Squared Logarithmic Error (RMSLE). The proposed model recorded its best performance, with an MAE of 0.102, MSE of 0.027, RMSE of 0.160, R² of 0.999, MSLE of 0.001, and RMSLE of 0.030, based on k-fold cross-validation. Among the input features, the percentage of plastics in the mixture (wt%) and RSS particle size (mm) are identified as the most influential in the proposed model's predictions by Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), and a feature importance plot.
With 99.99% of the H₂ production data points falling within the range of reliability, the model demonstrates robust predictive capability: according to the modified Williams plot, the majority of observations exert minimal leverage (0 ≤ u ≤ [leverage threshold]) and limited influence (0 ≤ H ≤ [cooks' threshold]) on the predictive outcome. Furthermore, various visualization approaches, such as the Matthews correlation coefficient and Taylor charts, were adapted to explain the results. The proposed model's results were compared with those of state-of-the-art models, exploring the significance of the proposed model in providing insights into the underlying mechanisms and factors that influence hydrogen production processes. This improves human understanding of the relationships between input factors and hydrogen production outputs and bridges the gap between predictive accuracy and interpretability.
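The six evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the logarithmic errors are assumed to use the common log(1 + y) form, which the abstract does not specify.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, R2, MSLE and RMSLE for two equal-length sequences.

    Assumption: MSLE/RMSLE use log(1 + y), so all values must be greater than -1
    (in practice, non-negative targets such as production amounts).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Squared differences of log(1 + y) for the logarithmic errors
    log_errors = [math.log1p(t) - math.log1p(p) for t, p in zip(y_true, y_pred)]
    msle = sum(e * e for e in log_errors) / n
    rmsle = math.sqrt(msle)

    return {"MAE": mae, "MSE": mse, "RMSE": rmse,
            "R2": r2, "MSLE": msle, "RMSLE": rmsle}


if __name__ == "__main__":
    # Illustrative values only; these are not data from the paper.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 4.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Under k-fold cross-validation, a function like this would typically be applied to each held-out fold and the per-fold values averaged. Whether the reported figures were aggregated that way is not stated in the abstract.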
About the journal:
Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.