E Raveendrakumar, B Gopichand, H Bhosale, N Melethadathil, J Valadi
Title: Uncovering blood-brain barrier permeability: a comparative study of machine learning models using molecular fingerprints, and SHAP explainability.
Journal: SAR and QSAR in Environmental Research, 35(12), pp. 1155-1171 (published 2024-12-01; Epub 2025-01-08)
DOI: https://doi.org/10.1080/1062936X.2024.2446352
Impact factor: 2.3 (JCR Q3, Chemistry, Multidisciplinary)
Citations: 0
Abstract
This study illustrates the use of chemical fingerprints with machine learning to predict blood-brain barrier (BBB) permeability. Using the Blood-Brain Barrier Database (B3DB) dataset, we extracted nine different fingerprints. Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) algorithms were used to develop the permeability prediction models, and the Random Forest recursive Feature Selection (RF-RFS) method was used to extract informative attributes. An additional database was employed for the validation phase. The results indicate that models built on all nine fingerprint datasets performed well in the training, test and validation stages. We then selected the MACCS keys fingerprint, one of the best-performing representations, for explainability analysis, applying SHapley Additive exPlanations (SHAP) to this dataset to identify the key structural features influencing BBB permeability prediction. These features include aliphatic carbons, methyl groups and oxygen-containing groups. The study highlights the effectiveness of different fingerprint descriptors in predicting BBB permeability, and the SHAP analysis adds explanatory value to the models. These models should be of significant help in drug discovery, particularly in developing Central Nervous System (CNS) therapeutics.
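To make the explainability step concrete, the following minimal sketch computes exact Shapley values for a toy scoring function over three binary fingerprint bits. It is not the authors' pipeline: the feature names, the coefficients in `toy_model`, and the all-zeros baseline are all hypothetical, and the SHAP library itself relies on model-specific or sampling-based estimators against a background dataset rather than the brute-force enumeration used here.

```python
from itertools import combinations
from math import factorial

# Hypothetical fingerprint bits; the names are illustrative, not real MACCS key indices.
FEATURES = ["aliphatic_carbon", "methyl_group", "oxygen_group"]


def toy_model(bits):
    """Made-up permeability score standing in for a trained SVM/XGBoost model."""
    score = 0.2
    score += 0.3 * bits["aliphatic_carbon"]
    score += 0.2 * bits["methyl_group"]
    score -= 0.4 * bits["oxygen_group"]
    # Interaction term, so that credit must be shared between two bits.
    score += 0.1 * bits["aliphatic_carbon"] * bits["methyl_group"]
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values of `x` relative to a single `baseline` input.

    Features outside a coalition take their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: x[g] if (g in subset or g == f) else baseline[g]
                          for g in FEATURES}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in FEATURES}
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# A hypothetical molecule with all three bits set, explained against an
# all-zeros baseline (an assumption made for this sketch).
x = {name: 1 for name in FEATURES}
baseline = {name: 0 for name in FEATURES}
phi = shapley_values(toy_model, x, baseline)

for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the model output for the molecule and for the baseline, which is the additivity property that gives SHAP its name. In this toy case the interaction term is split evenly between the two bits that take part in it.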
About the journal:
SAR and QSAR in Environmental Research is an international journal welcoming papers on the fundamental and practical aspects of the structure-activity and structure-property relationships in the fields of environmental science, agrochemistry, toxicology, pharmacology and applied chemistry. A unique aspect of the journal is the focus on emerging techniques for the building of SAR and QSAR models in these widely varying fields. The scope of the journal includes, but is not limited to, the topics of topological and physicochemical descriptors, mathematical, statistical and graphical methods for data analysis, computer methods and programs, original applications and comparative studies. In addition to primary scientific papers, the journal contains reviews of books and software and news of conferences. Special issues on topics of current and widespread interest to the SAR and QSAR community will be published from time to time.