Exploring the mathematic equations behind the materials science data using interpretable symbolic regression
Guanjie Wang, Erpeng Wang, Zefeng Li, Jian Zhou, Zhimei Sun
Interdisciplinary Materials 3(5), pp. 637-657. Published 2024-05-29. DOI: 10.1002/idm2.12180
https://onlinelibrary.wiley.com/doi/10.1002/idm2.12180
Abstract
Symbolic regression (SR), which searches a given data set for mathematical expressions in order to construct an interpretable model, is emerging as a powerful computational technique with the potential to transform “black box” machine learning methods into physically and chemically interpretable expressions in materials science research. This review surveys the current advances in SR, focusing on the underlying theories, fundamental flowcharts, various techniques, implemented codes, and application fields. More importantly, it discusses the challenges that must be overcome, and the future opportunities that follow, to unlock the full potential of SR in materials design and research: graphics processing unit acceleration and transfer learning algorithms, the trade-off between expression accuracy and complexity, physically or chemically interpretable SR with generative large language models, and multimodal SR methods.
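To make the idea of searching for an expression under an accuracy/complexity trade-off concrete, here is a minimal sketch in Python. It is not a method from the review: it assumes a brute-force enumeration of tiny expression trees, and the operator set, the terminals, the synthetic data, and the `complexity_penalty` weight are all illustrative choices. Practical SR tools use far more scalable search strategies.

```python
import itertools


def enumerate_expressions(max_depth):
    """Return (text, function, size) for every expression tree up to max_depth.

    Terminals are x, 1 and 2; operators are + and * (an illustrative choice).
    Size counts every node in the tree and serves as the complexity measure.
    """
    terminals = [
        ("x", lambda x: x, 1),
        ("1", lambda x: 1.0, 1),
        ("2", lambda x: 2.0, 1),
    ]
    if max_depth == 0:
        return terminals
    smaller = enumerate_expressions(max_depth - 1)
    expressions = list(smaller)
    for (ta, fa, sa), (tb, fb, sb) in itertools.product(smaller, repeat=2):
        # Default arguments bind fa and fb now, avoiding late-binding closures.
        expressions.append(
            (f"({ta} + {tb})", lambda x, fa=fa, fb=fb: fa(x) + fb(x), sa + sb + 1)
        )
        expressions.append(
            (f"({ta} * {tb})", lambda x, fa=fa, fb=fb: fa(x) * fb(x), sa + sb + 1)
        )
    return expressions


def mean_squared_error(func, xs, ys):
    """Average squared difference between func(x) and the observed y."""
    return sum((func(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def symbolic_search(xs, ys, max_depth=2, complexity_penalty=0.01):
    """Pick the expression minimising error plus a penalty on tree size."""
    best = None
    for text, func, size in enumerate_expressions(max_depth):
        mse = mean_squared_error(func, xs, ys)
        score = mse + complexity_penalty * size
        if best is None or score < best["score"]:
            best = {"text": text, "func": func, "size": size,
                    "mse": mse, "score": score}
    return best


# Synthetic data generated from y = x^2 + x (hypothetical, for illustration only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x + x for x in xs]

best = symbolic_search(xs, ys)
print(best["text"], "mse =", best["mse"], "size =", best["size"])
```

Raising `complexity_penalty` biases the search toward shorter but less accurate expressions, and lowering it does the opposite. This is the trade-off between expression accuracy and complexity named in the abstract, reduced to a single scalar weight.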