Exploring how base model combination affects the results of a “stacking” ensemble machine learning model: An applied study on optimization of heteroatom doped carbon data
Authors: Krittapong Deshsorn, Weekit Sirisaksoontorn, Wisit Hirunpinyopas, Pawin Iamprasertkun
Journal: FlatChem, volume 50, Article 100827 (Journal Article; published 2025-01-29)
DOI: 10.1016/j.flatc.2025.100827
URL: https://www.sciencedirect.com/science/article/pii/S2452262725000212
Journal metrics: impact factor 5.9; JCR Q2 (Chemistry, Physical); field: Materials Science
Abstract
This study explores stacking models for electrochemical analysis, combining base models (decision trees, linear regression, and k-nearest neighbors) with a meta-model. It shows that the order in which the base models are stacked affects predictions, often yielding multiple solutions. To address this “uncertainty,” a novel “sorting” technique was applied during meta-model training. This approach significantly reduced model uncertainty, achieving the most accurate predictions and minimizing order deviations (mean absolute error of 37.92388; standard deviation reduced from 6.19 × 10⁻¹⁵ to 0). The refined model was then used to analyze synergies between electrochemical and material properties with feature importance tools: SHAP, Feature Permutation Importance (FPI), and Partial Dependence Plots (PDP). Key insights for heteroatom-doped carbon supercapacitors suggest maximizing surface area and nitrogen, sulfur, and boron doping while minimizing current density and acidic electrolyte concentration. Optimal oxygen and phosphorus doping levels were ~15% and ~2.5%, respectively. FPI ranked the features as nitrogen > surface area > electrolyte concentration > oxygen > current density > defect ratio > sulfur > boron > phosphorus. PDP revealed that dual heteroatom doping (e.g., nitrogen and oxygen) may outperform doping with five heteroatoms. These findings enhance the reliability of machine learning in materials science, offering pathways for efficient synthesis and optimization of two-dimensional materials.
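The abstract does not say how the “sorting” step is implemented. The reported spread (6.19 × 10⁻¹⁵) is on the scale of floating-point rounding, so one plausible reading is that the meta-model's arithmetic depends on the order of its inputs and that putting them in a canonical order removes the dependence. The Python sketch below illustrates only that general idea, under that assumption. The toy unit-weight meta-model, the function names, and the prediction values are all hypothetical and are not taken from the paper.

```python
from itertools import permutations


def meta_predict(base_preds):
    """Toy linear meta-model with unit weights (an assumption, not the paper's model).

    The sum is accumulated left to right so that the order of the
    base-model predictions is the order of the floating-point additions.
    """
    total = 0.0
    for p in base_preds:
        total += p
    return total


def meta_predict_sorted(base_preds):
    """Same toy meta-model, but the inputs are sorted into a canonical order first."""
    return meta_predict(sorted(base_preds))


# Hypothetical predictions for one sample from three base models
# (for example a decision tree, a linear regression, and k-nearest neighbors).
preds = (0.1, 0.2, 0.3)

# Collect the distinct outputs obtained over every stacking order.
unsorted_outputs = {meta_predict(order) for order in permutations(preds)}
sorted_outputs = {meta_predict_sorted(order) for order in permutations(preds)}

print(len(unsorted_outputs))  # more than one distinct value: order matters
print(len(sorted_outputs))    # a single value: order no longer matters
```

Sorting works here only because every input carries the same weight. A meta-model with a separate learned weight per base model would instead need a fixed, canonical ordering of the base models themselves.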
About the journal:
FlatChem - Chemistry of Flat Materials, a new voice in the community, publishes original and significant, cutting-edge research related to the chemistry of graphene and related 2D & layered materials. The overall aim of the journal is to combine the chemistry and applications of these materials, where the submission of communications, full papers, and concepts should contain chemistry in a materials context, which can be both experimental and/or theoretical. In addition to original research articles, FlatChem also offers reviews, minireviews, highlights and perspectives on the future of this research area with the scientific leaders in fields related to Flat Materials. Topics of interest include, but are not limited to, the following:
- Design, synthesis, applications and investigation of graphene, graphene related materials and other 2D & layered materials (for example Silicene, Germanene, Phosphorene, MXenes, Boron nitride, Transition metal dichalcogenides)
- Characterization of these materials using all forms of spectroscopy and microscopy techniques
- Chemical modification or functionalization and dispersion of these materials, as well as interactions with other materials
- Exploring the surface chemistry of these materials for applications in: Sensors or detectors in electrochemical/Lab on a Chip devices, Composite materials, Membranes, Environment technology, Catalysis for energy storage and conversion (for example fuel cells, supercapacitors, batteries, hydrogen storage), Biomedical technology (drug delivery, biosensing, bioimaging)