Dual-Stage Stacking Machine Learning Method Considering Virtual Sample Generation for the Prediction of ZIF-8's BET Specific Surface Area with Experimental Validation
Fengfei Chen, Hongguang Zhou, Xiaohui Yu, Yunpeng Zhao, Chenchen Wang, Bin Dai, Sheng Han
Journal: Langmuir (Impact Factor 3.7; JCR Q2, Chemistry, Multidisciplinary)
Publication date: 2025-01-17
Publication type: Journal Article
DOI: 10.1021/acs.langmuir.4c04088 (https://doi.org/10.1021/acs.langmuir.4c04088)
Citations: 0
Abstract
The widespread application of metal–organic frameworks (MOFs) in wastewater and gas treatment has created an increasing demand for accurate and rapid assessment of their BET specific surface area. However, acquiring enough experimental data for statistical analysis is often costly and time-consuming. Therefore, this study proposes a dual-stage stacking model combined with Gaussian mixture model-based virtual sample generation (GMM-VSG) for predicting the BET specific surface area. In this study, 90 real samples were selected from the MOF database and 300 virtual samples were generated. Performance on both real and virtual samples was evaluated using four machine learning models: Bayesian regression (Bayes), adaptive boosting (AdaBoost), random forest (RF), and extreme gradient boosting (XGBoost). Subsequently, the three best-performing models and a linear regression model were used to construct a two-stage stacking model, which achieved an R² value of 0.974. Finally, experimental conditions were adjusted based on feature importance analysis during validation, and the results show a prediction accuracy of 0.943 for the BET specific surface area. This study contributes to the development of more efficient and accurate evaluation methods.
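The virtual sample generation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a Gaussian mixture is fitted to the joint (feature, target) vectors of the real samples and that virtual samples are then drawn from the fitted mixture. The toy data, the choice of two components, the diagonal covariances, and the function names `fit_diag_gmm` and `sample_virtual` are all hypothetical choices made for illustration; only the sample counts (90 real, 300 virtual) come from the abstract.

```python
import math
import random


def fit_diag_gmm(data, k, n_iter=50, seed=0, min_var=1e-6):
    """Fit a k-component Gaussian mixture with diagonal covariances using EM."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Initialise means from random rows and variances from the global spread.
    means = [list(row) for row in rng.sample(data, k)]
    gmean = [sum(r[d] for r in data) / n for d in range(dim)]
    gvar = [max(sum((r[d] - gmean[d]) ** 2 for r in data) / n, min_var)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities, computed from log-densities for stability.
        resp = []
        for row in data:
            logp = []
            for j in range(k):
                lp = math.log(weights[j])
                for d in range(dim):
                    v = variances[j][d]
                    lp -= 0.5 * (math.log(2 * math.pi * v)
                                 + (row[d] - means[j][d]) ** 2 / v)
                logp.append(lp)
            top = max(logp)
            p = [math.exp(lp - top) for lp in logp]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard against empty components
            weights[j] = nj / n
            for d in range(dim):
                mu = sum(r[j] * row[d] for r, row in zip(resp, data)) / nj
                var = sum(r[j] * (row[d] - mu) ** 2
                          for r, row in zip(resp, data)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, min_var)
    return weights, means, variances


def sample_virtual(weights, means, variances, n_samples, seed=1):
    """Draw virtual samples: pick a component by weight, then sample each dimension."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append([rng.gauss(mu, math.sqrt(v))
                        for mu, v in zip(means[j], variances[j])])
    return samples


# Toy stand-in for 90 real samples: [synthesis feature, BET surface area].
# These numbers are synthetic and are NOT taken from the paper.
toy_rng = random.Random(42)
real = ([[toy_rng.gauss(1.0, 0.2), toy_rng.gauss(1200.0, 50.0)] for _ in range(45)]
        + [[toy_rng.gauss(3.0, 0.3), toy_rng.gauss(1600.0, 60.0)] for _ in range(45)])

weights, means, variances = fit_diag_gmm(real, k=2)
virtual = sample_virtual(weights, means, variances, n_samples=300)
print(len(real), "real samples ->", len(virtual), "virtual samples")
```

Diagonal covariances keep the sketch dependency-free, but they ignore correlations between a feature and the target within a component. A full-covariance mixture from an established library would likely be a better choice in practice.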
About the journal:
Langmuir is an interdisciplinary journal publishing articles in the following subject categories:
Colloids: surfactants and self-assembly, dispersions, emulsions, foams
Interfaces: adsorption, reactions, films, forces
Biological Interfaces: biocolloids, biomolecular and biomimetic materials
Materials: nano- and mesostructured materials, polymers, gels, liquid crystals
Electrochemistry: interfacial charge transfer, charge transport, electrocatalysis, electrokinetic phenomena, bioelectrochemistry
Devices and Applications: sensors, fluidics, patterning, catalysis, photonic crystals
However, when high-impact, original work is submitted that does not fit within the above categories, decisions to accept or decline such papers will be based on one criterion: What Would Irving Do?
Langmuir ranks #2 in citations out of 136 journals in the category of Physical Chemistry with 113,157 total citations. The journal received an Impact Factor of 4.384*.
This journal is also indexed in the categories of Materials Science (ranked #1) and Multidisciplinary Chemistry (ranked #5).