Title: Explainable Synthesizability Prediction of Inorganic Crystal Polymorphs using Large Language Models
Authors: Seongmin Kim, Joshua Schrier, Yousung Jung
Journal: Angewandte Chemie International Edition
Publication type: Journal Article
Publication date: 2025-02-13
DOI: https://doi.org/10.1002/anie.202423950
Citation count: 0
Abstract
We evaluate the ability of machine learning to predict whether a hypothetical crystal structure can be synthesized and explain those predictions to scientists. Fine-tuned large language models (LLMs) trained on a human-readable text description of the target crystal structure perform comparably to previous bespoke convolutional graph neural network methods, but better prediction quality can be achieved by training a positive-unlabeled learning model on a text-embedding representation of the structure. An LLM-based workflow can then be used to generate human-readable explanations for the types of factors governing synthesizability, extract the underlying physical rules, and assess the veracity of those rules. These explanations can guide chemists in modifying or optimizing non-synthesizable hypothetical structures to make them more feasible for materials design.
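The abstract does not specify which positive-unlabeled (PU) learning algorithm or which text-embedding model was used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the classic Elkan-Noto correction (labeled positives are a random sample of all positives) and uses synthetic 2-D vectors as hypothetical stand-ins for text embeddings of crystal-structure descriptions. All function and variable names are illustrative.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, s, epochs=300, lr=0.1):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, s):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def labeled_prob(model, x):
    # g(x): estimated probability that x carries a "known synthesized" label.
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic stand-ins for embeddings (assumption: two separable clusters).
rng = random.Random(0)
positives = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(200)]
negatives = [(rng.gauss(-2, 1), rng.gauss(-2, 1)) for _ in range(200)]

# Only half of the true positives are labeled; the rest are hidden
# among the unlabeled examples together with all true negatives.
labeled = positives[:100]
unlabeled = positives[100:] + negatives

# Hold out some labeled positives to estimate the label frequency c.
holdout, train_labeled = labeled[:20], labeled[20:]

# Step 1: train a "labeled vs. unlabeled" classifier g(x).
X = train_labeled + unlabeled
s = [1] * len(train_labeled) + [0] * len(unlabeled)
model = fit_logistic(X, s)

# Step 2: c = average g(x) over held-out labeled positives (Elkan-Noto).
c = sum(labeled_prob(model, x) for x in holdout) / len(holdout)


def pu_score(x):
    # Step 3: corrected estimate of P(positive | x) = g(x) / c, clipped to 1.
    return min(1.0, labeled_prob(model, x) / c)


print("estimated label frequency c:", round(c, 3))
print("score near positive cluster:", round(pu_score((2.0, 2.0)), 3))
print("score near negative cluster:", round(pu_score((-2.0, -2.0)), 3))
```

In a realistic setting the labeled positives would be experimentally reported structures and the unlabeled set would be hypothetical ones, with embedding vectors of much higher dimension. A regularized or nonlinear classifier would likely replace the hand-written logistic regression used here.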
About the journal:
Angewandte Chemie, a journal of the German Chemical Society (GDCh), holds a leading position among scholarly journals in general chemistry, with an Impact Factor of 16.6 (2022 Journal Citation Reports, Clarivate, 2023). Established in 1887, it is published weekly and features new articles almost every day. The journal offers a blend of Review-type articles, Highlights, Communications, and Research Articles.