{"title":"Deep-Learning: Investigating feed-forward deep Neural Networks for modeling high throughput chemical bioactivity data","authors":"Jun Huan","doi":"10.1109/BIBM.2016.7822478","DOIUrl":null,"url":null,"abstract":"In recent years, research in Artificial Neural Networks (ANNs) has resurged, now under the Deep-Learning umbrella, and grown extremely popular due to major breakthroughs in methodological and computing capabilities. Deep-Learning methods are part of representation-learning algorithms that attempt to extract and organize discriminative information from the data. Recently reported success of DL techniques in crowd-sourced QSARs and predictive toxicology competitions has showcased these methods as powerful tools for drug-discovery and toxicology research. Nevertheless, reported applications of Deep Learning techniques for modeling complex bioactivity data for small molecules remain still limited. In this talk I will present our recent work on optimizing feed-forward Deep Neural Nets (DNNs) hyperparameters and performance evaluation of these methods as compared to shallow methods. In our study 48 DNNs, 24 Random Forest, 20 SVM and 6 Naive Bayes arbitrary but reasonably selected configurations were compared employing 7 diverse bioactivity datasets assembled from ChEMBL repository combined with circular fingerprints as molecular descriptors. The non-parametric Wilcoxon paired signed-rank test was employed to compare the performance of DNN to RF, SVM and NB. Overall it was found that DNNs with 2 hidden layers, 2,000 neurons per each hidden layer, ReLU activation function and Dropout regularization technique achieved strong classification performance across all tested datasets. Our results demonstrate that DNNs are powerful modeling techniques for modeling complex bioactivity data.","PeriodicalId":73283,"journal":{"name":"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine","volume":"32 1","pages":"5"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBM.2016.7822478","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0
Abstract
In recent years, research in Artificial Neural Networks (ANNs) has resurged, now under the Deep-Learning umbrella, and grown extremely popular owing to major breakthroughs in methodology and computing capabilities. Deep-Learning (DL) methods belong to the family of representation-learning algorithms, which attempt to extract and organize discriminative information from the data. Recently reported successes of DL techniques in crowd-sourced QSAR and predictive toxicology competitions have showcased these methods as powerful tools for drug-discovery and toxicology research. Nevertheless, reported applications of DL techniques for modeling complex bioactivity data for small molecules remain limited. In this talk I will present our recent work on optimizing the hyperparameters of feed-forward Deep Neural Nets (DNNs) and on evaluating their performance against shallow methods. In our study, 48 DNN, 24 Random Forest (RF), 20 SVM and 6 Naive Bayes (NB) configurations, arbitrarily but reasonably selected, were compared on 7 diverse bioactivity datasets assembled from the ChEMBL repository, with circular fingerprints as molecular descriptors. The non-parametric Wilcoxon paired signed-rank test was used to compare the performance of DNNs with that of RF, SVM and NB. Overall, DNNs with 2 hidden layers, 2,000 neurons per hidden layer, the ReLU activation function and Dropout regularization achieved strong classification performance across all tested datasets. Our results demonstrate that DNNs are powerful techniques for modeling complex bioactivity data.
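
The abstract names the Wilcoxon paired signed-rank test as the comparison method but does not say how it was computed. The following is a minimal sketch, assuming an exact two-sided test in which zero differences are dropped and tied absolute differences share an average rank. With only 7 datasets, all 2^7 sign patterns can be enumerated directly. The function names and the example scores are hypothetical illustrations, not values or code from the study.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(scores_a, scores_b):
    """Exact two-sided Wilcoxon signed-rank test on paired scores.

    Returns (w_plus, w_minus, p_value). Pairs with zero difference are dropped.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")

    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed = min(w_plus, w_minus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so enumerate every sign pattern and count those
    # at least as extreme as the observed statistic.
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= observed + 1e-12:
            hits += 1
    return w_plus, w_minus, hits / 2 ** len(ranks)


if __name__ == "__main__":
    # Hypothetical per-dataset scores for two methods on 7 datasets.
    dnn = [0.91, 0.88, 0.93, 0.85, 0.90, 0.87, 0.92]
    rf = [0.89, 0.87, 0.90, 0.865, 0.86, 0.82, 0.86]
    print(wilcoxon_signed_rank(dnn, rf))
```

With 7 pairs, the smallest two-sided p-value this exact test can produce is 2/128, reached only when one method wins on every dataset. Enumeration grows as 2^n, so a normal approximation or a library routine would be the more practical choice for many more pairs.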