Guishen Wang, Hui Feng, Mengyan Du, Yuncong Feng, Chen Cao
Journal of Chemical Information and Modeling (impact factor 5.6; JCR Q1, Chemistry, Medicinal). Published 2024-10-21. DOI: 10.1021/acs.jcim.4c01061 (https://doi.org/10.1021/acs.jcim.4c01061)
Multimodal Representation Learning via Graph Isomorphism Network for Toxicity Multitask Learning
Toxicity is central to understanding compound properties, particularly in the early stages of drug design. Because toxic effects are diverse and complex, predicting compound toxicity computationally is challenging. To address this challenge, we propose a multimodal representation learning model, termed the multimodal graph isomorphism network (MMGIN), for compound toxicity multitask learning. Based on the fingerprints and molecular graphs of compounds, our MMGIN model incorporates a multimodal representation learning model to acquire a comprehensive compound representation. This model adopts a two-channel structure to learn the fingerprint representation and the molecular graph representation independently. Subsequently, two feedforward neural networks use the learned multimodal compound representation to perform multitask learning, covering compound toxicity classification and multiple compound category classification simultaneously. To test the effectiveness of our model, we constructed a novel data set, termed the compound toxicity multitask learning (CTMTL) data set, derived from the TOXRIC data set. We compare our MMGIN model with other representative machine learning and deep learning models on the CTMTL and Tox21 data sets. The experimental results demonstrate significant improvements achieved by our MMGIN model. Furthermore, the ablation study underscores the effectiveness of the introduced fingerprints, molecular graphs, the multimodal representation learning model, and the multitask learning model, showcasing the model’s superior predictive capability and robustness.
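The two-channel, two-head design described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The layer sizes, the random weights, the concatenation used to fuse the two channels, the sum pooling over atoms, the sigmoid outputs, and the names `mmgin_forward`, `gin_layer`, `linear`, and `init` are all assumptions made for illustration. The only element taken from the literature is the standard graph isomorphism network update, in which each atom's new feature is an MLP applied to (1 + eps) times its own feature plus the sum of its neighbours' features.

```python
import math
import random

def linear(x, w, b):
    """Dense layer: x is a vector, w a list of rows (out x in), b a bias vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init(n_out, n_in, rng):
    """Random weights and zero biases for one dense layer (illustrative only)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def gin_layer(h, adj, params, eps=0.0):
    """One GIN update: h_v' = MLP((1 + eps) * h_v + sum of neighbour features).
    The MLP is reduced to a single dense layer plus ReLU to keep the sketch short."""
    w, b = params
    out = []
    for v, hv in enumerate(h):
        agg = [(1.0 + eps) * x for x in hv]
        for u in adj[v]:
            agg = [a + x for a, x in zip(agg, h[u])]
        out.append(relu(linear(agg, w, b)))
    return out

def mmgin_forward(fingerprint, node_feats, adj, p):
    # Channel 1: fingerprint bits through a dense layer.
    fp_repr = relu(linear(fingerprint, *p["fp"]))
    # Channel 2: molecular graph through one GIN layer, then sum-pooled over atoms.
    h = gin_layer(node_feats, adj, p["gin"])
    graph_repr = [sum(col) for col in zip(*h)]
    # Fuse the channels by concatenation (assumed; the abstract does not specify).
    joint = fp_repr + graph_repr
    # Two task heads share the joint representation.
    toxic = sigmoid(linear(joint, *p["tox"])[0])               # toxicity classification
    category = [sigmoid(z) for z in linear(joint, *p["cat"])]  # category classification
    return toxic, category

# Toy input: an 8-bit fingerprint and a 3-atom chain with one-hot atom features.
rng = random.Random(0)
params = {
    "fp": init(4, 8, rng),   # fingerprint channel: 8 bits -> 4 features
    "gin": init(4, 3, rng),  # graph channel: 3 atom features -> 4 features
    "tox": init(1, 8, rng),  # toxicity head on the 8-dimensional joint vector
    "cat": init(5, 8, rng),  # category head; 5 categories is a hypothetical count
}
fingerprint = [1, 0, 0, 1, 1, 0, 1, 0]
node_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
adj = {0: [1], 1: [0, 2], 2: [1]}
toxic, category = mmgin_forward(fingerprint, node_feats, adj, params)
```

Because both heads read the same joint vector, training signals from the two tasks would update the same fingerprint and graph channels, which is the usual motivation for sharing a representation in multitask learning.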
About the journal:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.