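Sequential gated recurrent and self attention explainable deep learning model for predicting hydrogen production: Implications and applicability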

Applied Energy (IF 10.1; Zone 1, Engineering & Technology; JCR Q1, Energy & Fuels). Pub Date: 2024-11-09. DOI: 10.1016/j.apenergy.2024.124851

Chiagoziem C. Ukwuoma, Dongsheng Cai, Chibueze D. Ukwuoma, Mmesoma P. Chukwuemeka, Blessing O. Ayeni, Chidera O. Ukwuoma, Odeh Victor Adeyi, Qi Huang

Applied Energy, vol. 378, Article 124851. Journal Article. https://www.sciencedirect.com/science/article/pii/S0306261924022347

Citations: 0


Sequential gated recurrent and self attention explainable deep learning model for predicting hydrogen production: Implications and applicability

Hydrogen has enormous potential as a clean and sustainable energy source for meeting the challenges of the current energy environment, and realizing that potential requires accurate prediction of hydrogen production. Because they can identify intricate patterns in data, machine learning and deep learning models have attracted considerable interest across many industries, including energy. Although these models perform acceptably, their prediction results still need improvement. They are also inherently black boxes, which makes their predictions difficult to comprehend and interpret, particularly in important sectors such as hydrogen generation. To address both issues, this study proposes a sequential gated recurrent and self-attention network to enhance hydrogen production prediction. The framework captures both sequential dependencies and contextual information, enabling the model to learn and represent temporal patterns in hydrogen production effectively. The experiments use a biomass gasification dataset and six evaluation metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), Mean Squared Logarithmic Error (MSLE), and Root Mean Squared Logarithmic Error (RMSLE). Under k-fold cross-validation, the proposed model achieved its best performance, with an MAE of 0.102, MSE of 0.027, RMSE of 0.160, R² of 0.999, MSLE of 0.001, and RMSLE of 0.030. Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), and a feature importance plot all identify the percentage of plastics in the mixture (wt%) and RSS particle size (mm) as the most influential input features. With 99.99% of the H₂ production data points falling within the range of reliability, the model demonstrates robust predictive capability: on the modified Williams plot, the majority of observations exert minimal leverage (0 ≤ u ≤ [leverage threshold]) and limited influence (0 ≤ H ≤ [Cook's threshold]) on the predicted outcome. Various visualization approaches, such as the Matthews correlation coefficient and Taylor charts, were also adapted to explain the results. Finally, the proposed model was compared with state-of-the-art models to explore its significance in providing insight into the mechanisms and factors that influence hydrogen production. This improves human understanding of the relationships between input factors and hydrogen production outputs and helps bridge the gap between predictive accuracy and interpretability.
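The six evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the function name `regression_metrics` is hypothetical, and the logarithmic errors assume the common `log(1 + y)` convention, which the abstract does not specify.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, R2, MSLE and RMSLE for two equal-length sequences.

    Hypothetical helper for illustration; assumes all values are > -1 so that
    log(1 + y) is defined for the logarithmic errors.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    # R2 = 1 - SS_res / SS_tot; undefined (NaN) when the targets are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Logarithmic errors compare log(1 + y) values (assumed convention).
    log_errors = [math.log1p(t) - math.log1p(p) for t, p in zip(y_true, y_pred)]
    msle = sum(e * e for e in log_errors) / n

    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": r2,
        "MSLE": msle,
        "RMSLE": math.sqrt(msle),
    }


if __name__ == "__main__":
    # Made-up numbers, purely to show the call; not data from the paper.
    for name, value in regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0]).items():
        print(f"{name}: {value:.4f}")
```

In a k-fold setting, one plausible use is to call such a helper once per held-out fold and average each metric across folds; whether the reported figures were aggregated that way is not stated in the abstract.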


Source journal

Applied Energy (Engineering & Technology; Engineering: Chemical)

CiteScore: 21.20

Self-citation rate: 10.70%

Articles published: 1830

Review time: 41 days

Journal description: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.