Authors: Moeko Tajima, Yuya Nagai, Siyan Chen, Zhenhua Pan, Kenji Katayama
Journal: Analyst (impact factor 3.6; JCR Q2, Chemistry, Analytical)
Publication date: 2024-06-26
Publication type: Journal Article
DOI: 10.1039/d4an00439f (https://doi.org/10.1039/d4an00439f)
Citations: 0
Robust methodology for PEC performance analysis of photoanodes using machine learning and analytical data
Machine learning (ML) is increasingly applied across many fields, including chemistry, where it is used for molecular design and for optimizing reaction parameters. Applying ML to experimental data remains challenging, however, because the number of synthesized samples is limited, which restricts its broader use in device development. In energy harvesting, photoanodes are crucial for solar-driven water splitting, which generates hydrogen and oxygen. We studied hematite and bismuth vanadate electrodes for photocatalytic use and found that their photoelectrochemical (PEC) performance varied despite similar preparation. To understand this variability, we applied a data-driven ML approach that predicts photocurrent values and identifies the key factors influencing performance, even with the limited experimental data typical of inorganic device research and development. Traditional ML methods used multiple algorithms, which obscured the influence of specific factors. We introduced a novel methodology that uses clustering to manage multicollinearity arising from correlated analytical data and Shapley analysis to interpret clearly how each factor contributes to the performance prediction. The method was validated on hematite and bismuth vanadate, where it showed superior predictability and factor identification, and was then extended to tungsten oxide and bismuth vanadate heterojunction photoanodes. Despite the complexity of these systems, our approach achieved coefficients of determination (R²) above 0.85 and successfully pinpointed performance-determining factors, demonstrating the robustness of the new scheme for advancing photodevice research.
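The two ideas named in the abstract, clustering correlated analytical features and attributing a prediction with Shapley values, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which clustering algorithm, model, or Shapley estimator was used. The sketch assumes single-linkage grouping on absolute Pearson correlation with a hypothetical threshold of 0.9, and exact Shapley values in which "absent" features are replaced by a baseline (here the column means). The feature names, the numbers, and the linear `predict` function are all made up for illustration.

```python
from itertools import combinations
from math import factorial, sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def cluster_correlated(columns, threshold=0.9):
    """Group feature names whose |r| >= threshold (single linkage, union-find)."""
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        return predict([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical analytical data for five samples (illustrative values only).
columns = {
    "raman_peak_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "raman_peak_b": [2.1, 3.9, 6.2, 8.0, 9.9],   # nearly collinear with raman_peak_a
    "film_thickness": [3.0, 1.0, 4.0, 1.0, 5.0],
}
clusters = cluster_correlated(columns, threshold=0.9)

# Keep one representative per cluster, then explain one sample's prediction.
reps = [group[0] for group in clusters]            # raman_peak_a, film_thickness
baseline = [sum(columns[r]) / len(columns[r]) for r in reps]
x = [columns[r][3] for r in reps]                  # fourth sample


def predict(z):
    """Stand-in for a trained regressor (hypothetical linear photocurrent model)."""
    return 2.0 * z[0] - 3.0 * z[1] + 1.0


phi = shapley_values(predict, x, baseline)
print(clusters)
print(dict(zip(reps, phi)))
```

Grouping the two nearly collinear peaks before attribution avoids splitting one physical effect across two redundant columns, which is the kind of ambiguity multicollinearity introduces. The exact enumeration costs on the order of 2^n model evaluations per feature, so it is only practical for a handful of cluster representatives; for larger feature sets an approximate estimator would be needed.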