Daniel Manu, Yi Sheng, Junhuan Yang, Jieren Deng, Tong Geng, Ang Li, Caiwen Ding, Weiwen Jiang, Lei Yang
Published in: 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), 2021-11-01
DOI: 10.1109/ICCAD51958.2021.9643440 (https://doi.org/10.1109/ICCAD51958.2021.9643440)
FL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery: Special Session Paper
The global COVID-19 pandemic underscores the importance of effective collaborative drug discovery; however, stringent data regulations make data privacy a pressing issue that must be addressed before such collaboration is possible. Beyond privacy, efficiency is another key objective: infectious diseases spread exponentially, so conducting drug discovery efficiently could save lives. Advanced Artificial Intelligence (AI) techniques are promising for solving these problems: (1) Federated Learning (FL) is designed to preserve data privacy while learning from data held by distributed clients; (2) a graph neural network (GNN) can extract structural properties of molecules, whose underlying structure is a graph of connected atoms; and (3) a generative adversarial network (GAN) can generate novel molecules while retaining the properties learned from the training data. In this work, we make the first attempt to build a holistic, collaborative, and privacy-preserving FL framework, namely FL-DISCO, which integrates GAN and GNN to generate molecular graphs. Experimental results demonstrate the effectiveness of FL-DISCO on: (1) IID data for ESOL and QM9, where FL-DISCO generates highly novel compounds with high drug-likeness, uniqueness, and LogP scores compared to the baseline; and (2) non-IID data for ESOL and QM9, where FL-DISCO generates 100% novel compounds with high validity and LogP scores compared to the baseline. We also demonstrate how different fractions of clients and different generator and discriminator architectures affect our evaluation scores.
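To make the federated setting and the "fraction of clients" knob concrete, the following is a minimal sketch of one aggregation step. It assumes a FedAvg-style rule (a size-weighted average of client parameters) and assumes that both generator and discriminator parameters are aggregated; the abstract does not state which aggregation rule FL-DISCO uses, so neither assumption should be read as the authors' method. The names `select_clients`, `federated_average`, and the flat-list weight layout are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def select_clients(num_clients: int, fraction: float, rng: random.Random) -> List[int]:
    """Sample a fraction of the clients for one round (at least one client)."""
    # Truncate fraction * num_clients, but never go below one client.
    k = max(1, int(fraction * num_clients))
    return sorted(rng.sample(range(num_clients), k))


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Size-weighted average of client parameters (FedAvg-style, assumed)."""
    total = sum(client_sizes)
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example: two clients holding 1 and 3 local molecules respectively.
clients = [
    {"generator": [1.0, 2.0], "discriminator": [0.0]},
    {"generator": [3.0, 4.0], "discriminator": [2.0]},
]
sizes = [1, 3]
global_weights = federated_average(clients, sizes)

# Pick half of ten clients for a round, with a fixed seed for repeatability.
chosen = select_clients(10, 0.5, random.Random(0))
```

In this sketch only parameters leave a client, never its molecules, which is the privacy property the abstract attributes to FL. Lowering `fraction` reduces how many clients contribute to each round.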