Deep semi-supervised learning for DTI prediction using large datasets and H2O-spark platform
Meriem Bahi, M. Batouche
2018 International Conference on Intelligent Systems and Computer Vision (ISCV), 2018-04-01
DOI: 10.1109/ISACV.2018.8354081 (https://doi.org/10.1109/ISACV.2018.8354081)
Citations: 9
Abstract
Drug repositioning is the process of reusing existing drugs for new indications by identifying potential drug-target interactions (DTIs). However, predicting new associations between drugs and target proteins in silico is challenging because known DTIs are scarce and no experimentally verified negative drug-target interaction samples exist. Furthermore, the volume of genomic sequence and chemical structure data is growing exponentially, so processing it takes considerable time and effort. For these reasons, we propose DSSL-DTIs, a new computational method based on deep semi-supervised learning, to accurately predict new DTIs in the post-genome era using large datasets and the Spark-H2O platform. First, we use stacked autoencoders to convert high-dimensional features into low-dimensional representations. Then, we apply a second unsupervised stacked autoencoder model to initialize the weights of a supervised deep neural network. Compared with other state-of-the-art methods applied to the same DrugBank reference dataset, our approach outperforms these techniques, with an overall accuracy above 98%. DSSL-DTIs can further be used to predict new drug-target interactions at large scale. The highly ranked candidate DTIs obtained from DSSL-DTIs are also confirmed in the DrugBank database and in the literature, which demonstrates the effectiveness of our method.
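
The two-stage idea in the abstract (unsupervised autoencoder training, then using the learned weights to initialize a supervised network) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python instead of the Spark-H2O platform, a single hidden layer instead of a stack, and synthetic vectors instead of DrugBank features. All function names (`pretrain_autoencoder`, `fine_tune`, `make_toy_pairs`, and so on) and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def encode(W, b, x):
    """Hidden activations h = sigmoid(W x + b)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def pretrain_autoencoder(X, n_hidden, epochs=200, lr=0.1, seed=0):
    """Unsupervised step: fit x -> h -> x_hat by SGD on squared error.
    Returns encoder weights, encoder biases, and the mean loss per epoch."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    c = [0.0] * n_in
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in X:
            h = encode(W, b, x)
            x_hat = [sum(v * hj for v, hj in zip(row, h)) + ci
                     for row, ci in zip(V, c)]  # linear decoder
            err = [r - xi for r, xi in zip(x_hat, x)]
            total += sum(e * e for e in err)
            # Backpropagate through the decoder before updating it.
            dh = [sum(err[i] * V[i][j] for i in range(n_in)) * h[j] * (1.0 - h[j])
                  for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    V[i][j] -= lr * err[i] * h[j]
                c[i] -= lr * err[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W[j][k] -= lr * dh[j] * x[k]
                b[j] -= lr * dh[j]
        losses.append(total / len(X))
    return W, b, losses


def fine_tune(X, y, W, b, epochs=300, lr=0.5, seed=1):
    """Supervised step: start from the pretrained encoder, add a logistic
    output unit, and train all weights on the labeled pairs."""
    rng = random.Random(seed)
    W = [row[:] for row in W]  # copy so the pretrained weights stay intact
    b = b[:]
    u = [rng.uniform(-0.5, 0.5) for _ in W]
    d = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h = encode(W, b, x)
            p = sigmoid(sum(uj * hj for uj, hj in zip(u, h)) + d)
            g = p - t  # gradient of cross-entropy w.r.t. the output logit
            dh = [g * u[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
            for j in range(len(h)):
                u[j] -= lr * g * h[j]
                for k in range(len(x)):
                    W[j][k] -= lr * dh[j] * x[k]
                b[j] -= lr * dh[j]
            d -= lr * g
    return W, b, u, d


def predict_proba(model, x):
    """Predicted probability that the pair x is an interaction."""
    W, b, u, d = model
    h = encode(W, b, x)
    return sigmoid(sum(uj * hj for uj, hj in zip(u, h)) + d)


def make_toy_pairs(n, seed=42):
    """Synthetic stand-in for drug-target feature vectors (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        base = [0.9, 0.8, 0.1, 0.2] if label else [0.1, 0.2, 0.9, 0.8]
        X.append([v + rng.uniform(-0.1, 0.1) for v in base])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_pairs(40)
    W, b, losses = pretrain_autoencoder(X, n_hidden=2)  # uses no labels
    model = fine_tune(X[:10], y[:10], W, b)             # uses 10 labels only
    hits = sum((predict_proba(model, x) > 0.5) == bool(t) for x, t in zip(X, y))
    print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {hits / len(X):.2f}")
```

In this sketch the autoencoder sees all 40 vectors without labels, while the classifier is fine-tuned on only 10 labeled ones. That mirrors the semi-supervised setting the abstract describes, where known DTIs are scarce but unlabeled drug-target pairs are plentiful.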