Zunsheng Han, Zhonghua Xia, Jie Xia, Igor V Tetko, Song Wu
bioRxiv, published 2024-07-16. DOI: 10.1101/2024.07.12.603170 (https://doi.org/10.1101/2024.07.12.603170)
The state-of-the-art machine learning model for Plasma Protein Binding Prediction: computational modeling with OCHEM and experimental validation
Plasma protein binding (PPB) is closely related to pharmacokinetics, pharmacodynamics, and drug toxicity. Predicting PPB computationally is an alternative to experimental measurement, which is time-consuming and costly. Although various models and web servers for PPB prediction are already available, they suffer from low prediction accuracy and poor interpretability, particularly for molecules with high PPB values, and most have not been properly validated in prospective studies. Here, we carried out strict data curation and applied consensus modeling to obtain a model with a coefficient of determination of 0.90 on the training set and 0.91 on the test set. The model was further validated in a prospective study on 63 poly-fluorinated compounds and another 25 highly diverse compounds, and it outperformed previously reported models on both sets. To identify structural features related to PPB, we analyzed a model based on Morgan2 fingerprints and found that features such as aromatic rings, halogen atoms, and heterocyclic rings discriminate between high- and low-PPB molecules. In conclusion, we have established a PPB prediction model that showed state-of-the-art performance in prospective screening, and we have made it publicly available on the OCHEM platform (https://ochem.eu/article/29).
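The abstract does not say how the individual models are combined or how the coefficient of determination is computed. As a rough illustration only, the sketch below assumes the simplest form of consensus modeling (an unweighted average of per-compound predictions from several models) and the standard definition R^2 = 1 - SS_res / SS_tot. The function names `consensus_predict` and `r_squared` are hypothetical, and the numbers are made-up toy values, not data from the study.

```python
from statistics import fmean


def consensus_predict(model_predictions):
    """Unweighted consensus: average each compound's prediction across models.

    model_predictions is a list with one inner list per model; every inner
    list holds that model's predictions for the same compounds in the same
    order. Simple averaging is an assumption made for this sketch.
    """
    n_compounds = len(model_predictions[0])
    if any(len(preds) != n_compounds for preds in model_predictions):
        raise ValueError("every model must predict the same number of compounds")
    # zip(*...) groups the predictions compound by compound.
    return [fmean(per_compound) for per_compound in zip(*model_predictions)]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy PPB values in percent bound; purely illustrative, not from the paper.
    measured = [99.0, 85.0, 40.0, 95.0]
    model_a = [97.0, 80.0, 45.0, 96.0]
    model_b = [98.0, 88.0, 38.0, 92.0]

    consensus = consensus_predict([model_a, model_b])
    print("consensus predictions:", consensus)
    print("R^2 of consensus:", r_squared(measured, consensus))
```

Averaging tends to cancel the uncorrelated errors of the individual models, which is the usual motivation for consensus approaches. A weighted scheme, or one restricted to the best-performing models, would be an equally plausible reading of the abstract.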