Predicting phase structures is crucial for the discovery and design of high-performance high-entropy carbide ceramics (HECCs). Data-driven approaches are increasingly popular in materials science, but their performance depends heavily on dataset size. To address this, we propose a two-step data enhancement method that combines Auxiliary Enhanced-Least Squares Generative Adversarial Networks, a Tree-structured Parzen Estimator, and a Multilayer Perceptron (AE-LSGAN-TPE-MLP). The method improves model stability and data quality by adding auxiliary generator and discriminator networks that operate alongside the primary ones in the LSGAN framework. Data enhancement is carried out in two steps to further ensure data quality. In the first step, features are enhanced using a label-guided feature separation strategy: features belonging to different label categories are separated, and an AE-LSGAN model is trained independently for each category to generate realistic feature data. In the second step, pseudo-labels are predicted for the augmented features: an MLP model is pre-trained on 50% of the dataset, and the generated features are fed to it as input to obtain pseudo-labels. Finally, the augmented dataset is merged with the training set, and a TPE-MLP model is built for prediction. Comparative results show that the proposed method significantly improves model performance, achieving an accuracy of 0.9629 on a test set comprising 50% of the dataset. In addition, three HECCs were synthesized experimentally, further validating the model's accuracy. An interpretability analysis of the model was also performed; the SHapley Additive exPlanations (SHAP) summary plot shows that σχ and σV are the most influential features for the phase structure of HECCs. Notably, single-phase HECCs were more likely to form when σχ and σV were below 0.079 and 0.125, respectively. This work is expected to facilitate the discovery and design of HECCs with tailored phase structures and properties.
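The data flow of the two-step enhancement described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a per-class Gaussian sampler stands in for the per-category AE-LSGAN generator, and a nearest-centroid classifier stands in for the pre-trained MLP. All function names and the toy feature values are hypothetical, chosen only to show how label-guided separation, per-class generation, pseudo-labelling, and merging fit together.

```python
import random
from collections import defaultdict


def split_by_label(X, y):
    """Label-guided feature separation: group feature rows by class label."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return dict(groups)


def column_means(rows):
    """Mean of each feature column."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def fit_gaussian_sampler(rows, rng):
    """Stand-in for a per-class generator (ASSUMPTION: not an AE-LSGAN)."""
    means = column_means(rows)
    stds = [
        (sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)) ** 0.5
        for j in range(len(means))
    ]

    def sample(n):
        return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]

    return sample


def fit_nearest_centroid(X, y):
    """Stand-in for the pre-trained labeller (ASSUMPTION: not an MLP)."""
    centroids = {lab: column_means(rows) for lab, rows in split_by_label(X, y).items()}

    def predict(rows):
        def dist2(row, c):
            return sum((a - b) ** 2 for a, b in zip(row, c))

        return [min(centroids, key=lambda lab: dist2(row, centroids[lab])) for row in rows]

    return predict


def augment(X_train, y_train, n_per_class, seed=0):
    """Two-step enhancement: generate features per class, then pseudo-label them."""
    rng = random.Random(seed)
    # Step 1: train one generator per label category and sample new features.
    synthetic = []
    for rows in split_by_label(X_train, y_train).values():
        synthetic.extend(fit_gaussian_sampler(rows, rng)(n_per_class))
    # Step 2: predict pseudo-labels for the generated features.
    pseudo_labels = fit_nearest_centroid(X_train, y_train)(synthetic)
    # Merge the augmented data with the original training set.
    return X_train + synthetic, y_train + pseudo_labels


# Hypothetical toy data: two features per sample, binary phase label.
X = [[0.05, 0.10], [0.06, 0.11], [0.20, 0.30], [0.22, 0.28]]
y = [1, 1, 0, 0]
X_aug, y_aug = augment(X, y, n_per_class=3)
```

In this sketch the pseudo-labels come from a separate classifier rather than from the class whose generator produced the sample, which mirrors the abstract's description of predicting labels for generated features in a second step. The merged set would then be passed to whatever final model is used (a TPE-tuned MLP in the source).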
