Towards machine learning prediction of the fluorescent protein absorption spectra
Roman A. Stepanyuk, Igor V. Polyakov, Anna M. Kulakova, Ekaterina I. Marchenko, Maria G. Khrenova
Mendeleev Communications, volume 34, issue 6, pages 788-791. Published 2024-11-01. Journal Article.
DOI: 10.1016/j.mencom.2024.10.007
https://www.sciencedirect.com/science/article/pii/S0959943624003055
Abstract
We demonstrate that machine learning models trained on a set of features obtained from QM/MM molecular dynamics trajectories of fluorescent proteins can be used to predict the variation of the chromophore dipole moment upon excitation, a quantity related to the electronic excitation energy. Linear regression, gradient boosting, and artificial neural network-based models were considered using cross-validation on the training dataset. The gradient boosting approach proved to be the most accurate for both the internal (R² = 0.77) and external (R² = 0.7) test sets.
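For readers unfamiliar with the evaluation quantities quoted in the abstract, the following is a minimal sketch of how a coefficient of determination (R²) and a k-fold cross-validated score can be computed. It is not the authors' pipeline: the data are synthetic, a single-feature least-squares line stands in for the regression models named in the abstract, and all function names (`r2_score`, `fit_line`, `k_fold_r2`) are hypothetical helpers introduced only for illustration.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with one feature (illustrative stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_r2(xs, ys, k=5, seed=0):
    """Mean R^2 over k folds; each fold is held out exactly once."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [slope * xs[i] + intercept for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], predictions))
    return sum(scores) / k


# Synthetic data (an assumption for illustration, not the paper's features or targets).
rng = random.Random(42)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 0.5 + rng.gauss(0.0, 0.2) for x in xs]

cv_r2 = k_fold_r2(xs, ys, k=5)
print(f"5-fold cross-validated R^2 on synthetic data: {cv_r2:.2f}")
```

An R² of 1 means the predictions reproduce the targets exactly, while a model that always predicts the mean of the targets scores 0. The internal and external test-set values in the abstract are read on this scale.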
About the journal:
Mendeleev Communications is the journal of the Russian Academy of Sciences, launched jointly by the Academy of Sciences of the USSR and the Royal Society of Chemistry (United Kingdom) in 1991. Since 1 January 2007, Elsevier has been the publishing partner of Mendeleev Communications.
Mendeleev Communications publishes short communications in chemistry. The journal primarily features papers from the Russian Federation and the other states of the former USSR. However, it also includes papers by authors from other parts of the world. Mendeleev Communications is not a translated journal, but instead is published directly in English. The International Editorial Board is composed of eminent scientists who provide advice on refereeing policy.