Bidisha Samanta, A. De, G. Jana, P. Chattaraj, Niloy Ganguly, Manuel Gomez Rodriguez
NeVAE: A Deep Generative Model for Molecular Graphs
DOI: 10.1609/aaai.v33i01.33011110 (https://doi.org/10.1609/aaai.v33i01.33011110)
J. Mach. Learn. Res., volume 44 1, pages 114:1-114:33
Publication date: 2018-02-14
Publication type: Journal Article
Citations: 177
Abstract
Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs because of three characteristics of such graphs: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they vary in their number of nodes and edges. In this paper, we propose NeVAE, a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. In addition, by using masking, the decoder is able to guarantee a set of valid properties in the generated molecules. Experiments reveal that our model can discover plausible, diverse, and novel molecules more effectively than several state-of-the-art methods. Moreover, by applying Bayesian optimization over the continuous latent representation of molecules that our model learns, we can also find molecules that maximize certain desirable properties more effectively than alternatives.
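The abstract does not spell out how masking enforces validity, so the following is only an illustrative sketch and not the authors' decoder. It assumes that the validity property is a per-atom valence limit, that all bonds are single bonds, and that edge scores are given as plain numbers; the names `MAX_VALENCE`, `masked_edge_distribution`, and `sample_edges` are hypothetical. The idea shown is that candidate edges that would violate a constraint are removed before the softmax, so they can never be sampled.

```python
import math
import random

# Hypothetical valence limits for a few atom types (illustrative assumption).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "H": 1}


def masked_edge_distribution(scores, atoms, degree, existing_edges):
    """Turn raw edge scores into probabilities, masking out invalid edges.

    scores: dict mapping (u, v) with u < v to a real-valued score.
    atoms: list of atom symbols, indexed by node.
    degree: current number of bonds on each node.
    existing_edges: set of (u, v) pairs already in the graph.
    """
    valid = {}
    for (u, v), s in scores.items():
        if (u, v) in existing_edges:
            continue  # mask: this edge is already present
        if degree[u] >= MAX_VALENCE[atoms[u]] or degree[v] >= MAX_VALENCE[atoms[v]]:
            continue  # mask: one endpoint has no free valence left
        valid[(u, v)] = s
    if not valid:
        return {}
    # Numerically stable softmax over the unmasked edges only.
    top = max(valid.values())
    exps = {e: math.exp(s - top) for e, s in valid.items()}
    total = sum(exps.values())
    return {e: x / total for e, x in exps.items()}


def sample_edges(scores, atoms, max_edges, seed=0):
    """Sample edges one at a time, recomputing the mask after each step."""
    rng = random.Random(seed)
    degree = [0] * len(atoms)
    edges = set()
    for _ in range(max_edges):
        probs = masked_edge_distribution(scores, atoms, degree, edges)
        if not probs:
            break  # every remaining candidate edge is masked
        candidates = sorted(probs)
        weights = [probs[e] for e in candidates]
        u, v = rng.choices(candidates, weights=weights, k=1)[0]
        edges.add((u, v))
        degree[u] += 1
        degree[v] += 1
    return edges, degree


if __name__ == "__main__":
    atoms = ["C", "O", "H", "H"]
    # Uniform scores over all node pairs, purely for demonstration.
    scores = {(u, v): 0.0 for u in range(4) for v in range(u + 1, 4)}
    edges, degree = sample_edges(scores, atoms, max_edges=6)
    print(sorted(edges), degree)
```

Because the mask is recomputed after every sampled edge, no node can exceed its assumed valence limit regardless of the scores. A real decoder would also need to handle bond orders and other chemical constraints, which this sketch leaves out.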