Assessment of Fine-Tuned Large Language Models for Real-World Chemistry and Material Science Applications

Joren Van Herck, María Victoria Gil, Kevin Maik Jablonka, Alex Abrudan, Andy Sode Anker, Mehrdad Asgari, Ben J Blaiszik, Antonio Buffo, Leander Choudhury, Clemence Corminboeuf, Hilal Daglar, Amir Mohammad Elahi, Ian T. Foster, Susana García, Matthew Garvin, Guillaume Godin, Lydia L. Good, Jianan Gu, Noémie Xiao Hu, Xin Jin, Tanja Junkers, Seda Keskin, Tuomas Knowles, Ruben Laplaza, Michele Lessona, Sauradeep Majumdar, Hossein Mashhadimoslem, Ruaraidh D McIntosh, Seyed Mohamad Moosavi, Beatriz Mouriño, Francesca Nerli, Cova Pevida, Neda Poudineh, Mahyar Rajabi Kochi, Kadi-Liis Saar, Fahimeh Hooriabad Saboor, Morteza Sagharichiha, K. J. Schmidt, Jiale Shi, Elena Simone, Dennis Svatunek, Marco Taddei, Igor V. Tetko, Domonkos Tolnai, Sahar Vahdatifar, Jonathan K. Whitmer, Florian Wieland, Regine Willumeit-Römer, Andreas Züttel, Berend Smit

Chemical Science · Published 2024-11-22 · DOI: 10.1039/d4sc04401k (https://doi.org/10.1039/d4sc04401k)
Abstract
The current generation of large language models (LLMs) has limited chemical knowledge. Recently, however, it has been shown that these LLMs can learn and predict chemical properties through fine-tuning. Using natural language to train machine learning models opens the door to a wider chemical audience, as field-specific featurization techniques can be omitted. In this work, we explore the potential and limitations of this approach. We study the performance of fine-tuning three open-source LLMs (GPT-J-6B, Llama-3.1-8B, and Mistral-7B) on a range of different chemical questions. We benchmark their performance against "traditional" machine learning models and find that, in most cases, the fine-tuning approach is superior for simple classification problems. Depending on the size of the dataset and the type of question, we also successfully address more sophisticated problems. The most important conclusions of this work are that, for all datasets considered, conversion into an LLM fine-tuning training set is straightforward, and that fine-tuning even with relatively small datasets yields predictive models. These results suggest that the systematic use of LLMs to guide experiments and simulations will be a powerful technique in any research study, significantly reducing unnecessary experiments or computations.
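The dataset conversion the abstract calls "straightforward" can be pictured as turning each labeled data point into a natural-language prompt/completion pair. The sketch below is a minimal illustration of that idea, assuming a SMILES-labeled classification dataset and the JSON Lines format commonly used by fine-tuning tooling; the column names, prompt template, and example rows are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: convert a tabular chemistry dataset into
# prompt/completion pairs for LLM fine-tuning. The prompt template and
# example data are illustrative, not the authors' exact format.
import json

def to_finetune_record(smiles: str, property_name: str, value: str) -> dict:
    """Turn one labeled data point into a text prompt/completion pair."""
    return {
        "prompt": f"What is the {property_name} of the molecule with SMILES {smiles}?",
        "completion": f" {value}",
    }

# Example rows: (SMILES, class label) for a binary classification task.
rows = [
    ("CCO", "high"),
    ("c1ccccc1", "low"),
]

# Write one JSON object per line (JSON Lines), a common fine-tuning input format.
with open("train.jsonl", "w") as f:
    for smiles, label in rows:
        f.write(json.dumps(to_finetune_record(smiles, "solubility", label)) + "\n")
```

Because the model consumes plain text, no molecular featurization (fingerprints, descriptors, graph encodings) is needed at this stage; the same template applies to any property for which labeled examples exist.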
Journal Introduction
Chemical Science is a journal that spans the disciplines of the chemical sciences. It publishes ground-breaking research with significant implications for its field that also appeals to a wider audience in related areas. To be considered for publication, articles must present innovative and original advances in their field of study in a manner that is understandable to scientists from diverse backgrounds; the journal generally does not publish highly specialized research.