Herein, we propose a machine learning (ML) framework for predicting four photovoltaic parameters of perovskite solar cells: open-circuit voltage (V_OC), short-circuit current density (J_SC), fill factor (FF), and power conversion efficiency (PCE). A dataset of 503 experimental entries was manually compiled from the peer-reviewed literature, and physically meaningful descriptors, such as perovskite composition, energy levels of the various layers, band offsets, charge mobilities, and architecture type, were used as inputs for the ML models. Four ensemble learning algorithms (Random Forest, Gradient Boosting, Extreme Gradient Boosting, and CatBoost) were then trained and evaluated using 5-fold cross-validation and reserved test sets. CatBoost showed the highest accuracy for PCE prediction (RMSE = 1.09%), while Random Forest performed best for V_OC, J_SC, and FF, achieving RMSE values as low as 0.033 V, 1.013 mA/cm², and 0.031, respectively. SHAP-based interpretability analysis revealed that intrinsic features of the perovskite, charge mobilities, and interfacial band alignment are essential to device performance. SHAP feature dependence plots were also used to study how individual features are associated with the predicted targets. Finally, an additional evaluation on 12 independent samples, which were not part of the training or test sets, confirmed model robustness, with predictions in close agreement with the reported experimental results. These results indicate that, in addition to producing sound predictions, the ML models can capture complex microscale features and correlate them with macroscale device operation, which can guide future experimentation.
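The evaluation protocol mentioned above (5-fold cross-validation scored by RMSE and the coefficient of determination) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the one-feature least-squares fit is only a stand-in for the ensemble models named in the abstract, and the helper names (`k_fold_indices`, `fit_line`, `cross_validate`) are hypothetical rather than taken from the authors' code.

```python
import math
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Stand-in model (assumption): one-feature ordinary least squares."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda xs: [slope * a + intercept for a in xs]


def cross_validate(x, y, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for every fold."""
    scores = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        predict = fit_line([x[i] for i in train], [y[i] for i in train])
        y_val = [y[i] for i in held_out]
        y_hat = predict([x[i] for i in held_out])
        scores.append((rmse(y_val, y_hat), r2(y_val, y_hat)))
    return scores


# Synthetic data (assumption): a generic descriptor and a noisy linear target.
rng = random.Random(42)
x = [rng.uniform(0.0, 1.0) for _ in range(100)]
y = [3.0 * a + 1.0 + rng.gauss(0.0, 0.1) for a in x]

scores = cross_validate(x, y, k=5)
for fold, (fold_rmse, fold_r2) in enumerate(scores, start=1):
    print(f"fold {fold}: RMSE = {fold_rmse:.3f}, R2 = {fold_r2:.3f}")
```

Each sample is held out exactly once, so the per-fold scores together estimate how the model performs on data it was not fitted to. In a setting like the one described, the stand-in fit would be replaced by the chosen ensemble regressor and the synthetic arrays by the compiled descriptors and measured targets.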
