Pseudo-Siamese Neural Network Based Graph and Sequence Representation Learning for Molecular Property Prediction
Chaoran Zhang, Xiangfeng Yan, Yong Liu
2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2022-12-06
DOI: 10.1109/BIBM55620.2022.9994859 (https://doi.org/10.1109/BIBM55620.2022.9994859)
Citations: 0
Abstract
Molecular property prediction has received considerable attention because of its wide application in the biomedical field. Effective molecular representation learning is essential to accurate property prediction. In recent years, as artificial intelligence has advanced, researchers have increasingly applied deep learning rather than traditional machine learning to this task. However, existing methods either learn a sequence representation from SMILES strings or learn a graph representation from molecular graphs; they do not combine the two, and so forgo the complementary molecular characteristics that each representation preserves. In this study, we propose PSGS, a joint graph and sequence representation learning model for molecular property prediction. Specifically, PSGS uses a fusion layer to combine the graph and sequence representations and capture the critical features of the molecule. In addition, PSGS is trained with a new self-supervised task that uses a pseudo-Siamese neural network to maximize the similarity between the graph and sequence representations of the same molecule. We conduct extensive experiments to compare our model with state-of-the-art models. Experimental results show that our model significantly outperforms current state-of-the-art methods on four independent datasets.
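
The two mechanisms named in the abstract, a fusion of graph and sequence representations and a self-supervised objective that pulls the two representations of one molecule together, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the loss, or the fusion operator, so everything below is an assumption made for illustration: cosine similarity as the measure, `1 - cosine` as the loss, concatenation as the fusion, and the hypothetical names `cosine_similarity`, `alignment_loss`, and `fuse`. The embeddings are toy vectors standing in for the outputs of a graph encoder and a SMILES encoder.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def alignment_loss(graph_emb: Vector, seq_emb: Vector) -> float:
    """Assumed objective (not taken from the paper): 1 - cosine similarity.

    Minimizing this value maximizes the similarity between the graph and
    sequence embeddings of the same molecule.
    """
    return 1.0 - cosine_similarity(graph_emb, seq_emb)


def fuse(graph_emb: Vector, seq_emb: Vector) -> List[float]:
    """Assumed fusion (not taken from the paper): plain concatenation."""
    return list(graph_emb) + list(seq_emb)


# Toy embeddings standing in for the two encoder branches.
graph_emb = [1.0, 0.0, 2.0]
seq_emb = [2.0, 0.0, 4.0]

print("alignment loss:", alignment_loss(graph_emb, seq_emb))
print("fused length:", len(fuse(graph_emb, seq_emb)))
```

In a pseudo-Siamese setup the two branches have separate, unshared weights, which suits inputs of different types such as a graph and a string. A trainable version would compute these quantities on encoder outputs inside an automatic-differentiation framework rather than on fixed lists.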