Pub Date : 2024-10-09 DOI: 10.1109/TCBB.2024.3477410
Jian Zhong, Haochen Zhao, Qichang Zhao, Jianxin Wang
Precisely predicting Drug-Drug Interactions (DDIs) can improve the quality and safety of drug therapies, protect the well-being of patients, and provide essential guidance and decision support at every stage of the drug development process. In recent years, leveraging large-scale biomedical knowledge graphs has improved DDI prediction performance. However, the feature extraction procedures in these methods remain coarse, and more refined features may further improve prediction quality. To overcome this limitation, we develop a knowledge graph-based method for multi-typed DDI prediction with contrastive learning (KG-CLDDI). In KG-CLDDI, we combine drug knowledge aggregation features from the knowledge graph with drug topological aggregation features from the DDI graph. Additionally, we build a contrastive learning module that uses horizontal reversal and dropout operations to produce high-quality embeddings for drug-drug pairs. The comparison results indicate that KG-CLDDI outperforms state-of-the-art models in both the transductive and inductive settings. Notably, in the inductive setting, KG-CLDDI outperforms the previous best method by 17.49% and 24.97% in terms of AUC and AUPR, respectively. Furthermore, we conduct an ablation analysis and a case study to show the effectiveness of KG-CLDDI. These findings illustrate the potential significance of KG-CLDDI in advancing DDI research and its clinical applications. The code for KG-CLDDI is available at https://github.com/jianzhong123/KG-CLDDI.
Title : A knowledge graph-based method for drug-drug interaction prediction with contrastive learning.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
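The contrastive module in the KG-CLDDI abstract can be pictured with a small, hedged sketch. It assumes that "horizontal reversal" means swapping the order of the two drug embeddings in the concatenated pair vector, and it scores views with a generic InfoNCE loss; both choices, the function names, and the omission of any encoder network are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def pair_view(drug_a, drug_b, reverse=False, drop_p=0.0, rng=None):
    """Build one view of a drug pair from two embedding lists.

    reverse=True swaps the order of the two drugs (assumed meaning of
    "horizontal reversal"); drop_p applies inverted dropout.
    """
    rng = rng or random.Random(0)
    vec = (drug_b + drug_a) if reverse else (drug_a + drug_b)
    keep = 1.0 - drop_p
    # Zero each entry with probability drop_p and rescale the survivors.
    return [x / keep if rng.random() >= drop_p else 0.0 for x in vec]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss: anchor i should match positive i among all positives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        top = max(logits)
        # Numerically stable log-sum-exp over the batch.
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)
```

In a real model the two views would pass through a learned encoder before the loss is computed; here the raw vectors stand in for encoder outputs so that the loss behaviour is easy to inspect.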
Pub Date : 2024-10-09 DOI: 10.1109/TCBB.2024.3477313
Aditya Malusare, Vaneet Aggarwal
Recent advancements in generative models have established state-of-the-art benchmarks in the generation of molecules and novel drug candidates. Despite these successes, a significant gap persists between generative models and the utilization of extensive biomedical knowledge, often systematized within knowledge graphs, whose potential to inform and enhance generative processes has not been realized. In this paper, we present a novel approach that bridges this divide by developing a framework for knowledge-enhanced generative models called KARL. We develop a scalable methodology to extend the functionality of knowledge graphs while preserving semantic integrity, and incorporate this contextual information into a generative framework to guide a diffusion-based model. The integration of knowledge graph embeddings with our generative model furnishes a robust mechanism for producing novel drug candidates possessing specific characteristics while ensuring validity and synthesizability. KARL outperforms state-of-the-art generative models on both unconditional and targeted generation tasks.
Title : Improving Molecule Generation and Drug Discovery with a Knowledge-enhanced Generative Model.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
Pub Date : 2024-10-08 DOI: 10.1109/TCBB.2024.3476453
Kiefer Andre Bedoya Benites, Wilser Andres Garcia-Quispes
Polymerase Chain Reaction-Restriction Fragment Length Polymorphism (PCR-RFLP) is an established molecular biology technique that leverages DNA sequence variability for organism identification, genetic disease detection, biodiversity analysis, and related applications. Traditional PCR-RFLP requires wet-laboratory procedures that can result in technical errors, procedural challenges, and financial costs. To provide an accessible and efficient complement to the PCR-RFLP technique, we introduce RFLP-inator, a comprehensive web-based platform developed in R with the Shiny package. It simulates the PCR-RFLP technique, integrates analysis capabilities, and offers complementary tools for both pre- and post-evaluation of in vitro results. We developed the RFLP-inator algorithm independently, and the platform offers seven dynamic tools: RFLP simulator, Pattern identifier, Enzyme selector, RFLP analyzer, Multiplex PCR, Restriction map maker, and Gel plotter. Moreover, the software includes a restriction pattern database of more than 250,000 sequences of the bacterial 16S rRNA gene. We successfully validated the core tools against published research findings. This new platform is open access and user-friendly, offering a valuable resource for researchers, educators, and students specializing in molecular genetics. RFLP-inator not only streamlines the application of the RFLP technique but also supports teaching in genetics. The software is available for free at https://kodebio.shinyapps.io/RFLP-inator/.
Title : RFLP-inator: interactive web platform for in silico simulation and complementary tools of the PCR-RFLP technique.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
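The core idea behind simulating PCR-RFLP in silico is a restriction digest: find every recognition site in an amplicon, cut there, and report fragment lengths. The sketch below is a minimal illustration in Python rather than the R/Shiny code of RFLP-inator; the function name `digest` and the `cut_offset` parameter are hypothetical, and it handles only linear sequences and exact (non-degenerate) recognition sites on the top strand.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear DNA sequence at each site.

    cut_offset is the number of bases of the recognition site that stay on the
    left of the cut (for example 1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Continue one base further so overlapping sites are also found.
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Drop zero-length pieces that arise when a cut falls on a sequence end.
    return [right - left for left, right in zip(bounds, bounds[1:]) if right > left]


# Toy amplicon with two EcoRI sites; sorting mimics reading bands on a gel.
amplicon = "AAA" + "GAATTC" + "TTTT" + "GAATTC" + "CC"
bands = sorted(digest(amplicon, "GAATTC", 1), reverse=True)
print(bands)  # [10, 7, 4]
```

Comparing such band lists between sequences is what makes a restriction pattern usable for identification: two sequences that differ at a recognition site yield different fragment sets.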
Pub Date : 2024-10-07 DOI: 10.1109/TCBB.2024.3475917
Reza Mazloom, N Tessa Pierce-Ward, Parul Sharma, Leighton Pritchard, C Titus Brown, Boris A Vinatzer, Lenwood S Heath
As a central organizing principle of biology, bacteria and archaea are classified into a hierarchical structure across taxonomic ranks from kingdom to subspecies. Traditionally, this organization was based on observable characteristics of form and chemistry but recently, bacterial taxonomy has been robustly quantified using comparisons of sequenced genomes, as exemplified in the Genome Taxonomy Database (GTDB). Such genome-based taxonomies resolve genomes down to genera and species and are useful in many contexts yet lack the flexibility and resolution of a fine-grained approach. The Life Identification Number (LIN) approach is a common, quantitative framework to tie existing (and future) bacterial taxonomies together, increase the resolution of genome-based discrimination of taxa, and extend taxonomic identification below the species level in a principled way. Utilizing LINgroup as an organizational concept helps resolve some of the confusion and unforeseen negative effects resulting from nomenclature changes of microorganisms that are closely related by overall genomic similarity (often due to genome-based reclassification). Our experimental results demonstrate the value of LINs and LINgroups in mapping between taxonomies, translating between different nomenclatures, and integrating them into a single taxonomic framework. They also reveal the robustness of LIN assignment to hyper-parameter changes when considering within-species taxonomic groups.
Title : LINgroups as a Robust Principled Approach to Compare and Integrate Multiple Bacterial Taxonomies.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
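As a rough illustration of how LINs and LINgroups can relate genomes, the sketch below assumes a LIN is stored as a tuple of integers whose positions correspond to increasingly strict genome-similarity thresholds, and that a LINgroup is identified by a shared LIN prefix. That representation and both function names are assumptions for this example; the abstract itself does not spell out the encoding.

```python
def shared_prefix_length(lin_a, lin_b):
    """Count leading positions at which two LINs agree."""
    count = 0
    for a, b in zip(lin_a, lin_b):
        if a != b:
            break
        count += 1
    return count


def in_lingroup(lin, group_prefix):
    """True when the LIN starts with the prefix that identifies the LINgroup."""
    return tuple(lin[:len(group_prefix)]) == tuple(group_prefix)


# Hypothetical LINs for three genomes.
genome_a = (0, 1, 2, 0)
genome_b = (0, 1, 3, 0)
genome_c = (0, 2, 0, 0)
print(shared_prefix_length(genome_a, genome_b))  # 2
print(in_lingroup(genome_c, (0, 1)))             # False
```

Under this reading, a longer shared prefix means two genomes remain grouped together at a stricter similarity threshold, which is how prefixes can express groupings below the species level.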
Pub Date : 2024-10-03 DOI: 10.1109/TCBB.2024.3473899
Zhenhao Sun, Meng Wang, Shiqi Wang, Sam Kwong
In this paper, we propose a Learning-based gEnome Codec (LEC), which is designed for high efficiency and enhanced flexibility. The LEC integrates several advanced technologies, including Group of Bases (GoB) compression, multi-stride coding and bidirectional prediction, all of which are aimed at optimizing the balance between coding complexity and performance in lossless compression. The model applied in our proposed codec is data-driven, based on deep neural networks to infer probabilities for each symbol, enabling fully parallel encoding and decoding with configured complexity for diverse applications. Based upon a set of configurations on compression ratios and inference speed, experimental results show that the proposed method is very efficient in terms of compression performance and provides improved flexibility in real-world applications.
Title : LEC-Codec: Learning-Based Genome Data Compression.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
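The LEC abstract says a neural network infers a probability for each symbol. The link between such probabilities and compressed size can be sketched as follows, assuming the probabilities feed an entropy coder (for example an arithmetic coder) that spends close to -log2(p) bits on a symbol predicted with probability p. The use of an entropy coder, the function names, and the dict-based probability format are assumptions for illustration, not details taken from the paper.

```python
import math


def ideal_code_length_bits(symbols, predicted_probs):
    """Sum of -log2 p(symbol) over the sequence (ideal entropy-coder cost)."""
    total = 0.0
    for symbol, probs in zip(symbols, predicted_probs):
        total += -math.log2(probs[symbol])
    return total


def bits_per_base(symbols, predicted_probs):
    """Average ideal cost per nucleotide."""
    return ideal_code_length_bits(symbols, predicted_probs) / len(symbols)


bases = "ACGT"
uniform = [{b: 0.25 for b in "ACGT"} for _ in bases]
print(bits_per_base(bases, uniform))  # 2.0
```

A model that knows nothing costs 2 bits per base, the same as a fixed 2-bit encoding of A, C, G, T; any gain in lossless compression comes from the model assigning higher probability to the base that actually occurs.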
Pub Date : 2024-09-30 DOI: 10.1109/TCBB.2024.3471930
Zhuoping Zhou, Boning Tong, Davoud Ataee Tarzanagh, Bojian Hou, Andrew J Saykin, Qi Long, Li Shen
Tensor Canonical Correlation Analysis (TCCA) is a statistical method commonly used to examine linear associations between two tensor datasets. However, existing TCCA models fail to adequately address the heterogeneity present in real-world tensor data, such as brain imaging data collected from diverse groups characterized by factors like sex and race. Consequently, these models may yield biased outcomes. To overcome this limitation, we propose Multi-Group TCCA (MG-TCCA), which enables the joint analysis of multiple subgroups. By incorporating a dual sparsity structure and a block coordinate ascent algorithm, MG-TCCA addresses heterogeneity and leverages information across groups to identify consistent signals. The approach facilitates the quantification of shared and individual structures, reduces data dimensionality, and enables visual exploration. To validate it empirically, we investigate correlations between two brain positron emission tomography (PET) modalities (AV-45 and FDG) within an Alzheimer's disease (AD) cohort. Our results demonstrate that MG-TCCA surpasses traditional TCCA and Sparse TCCA (STCCA) in identifying sex-specific cross-modality imaging correlations. This improved performance provides valuable insights for the characterization of multimodal imaging biomarkers in AD.
Title : MG-TCCA: Tensor Canonical Correlation Analysis across Multiple Groups.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
Pub Date : 2024-09-27 DOI: 10.1109/TCBB.2024.3469164
Xuena Liang, Junliang Shang, Jin-Xing Liu, Chun-Hou Zheng, Juan Wang
Recent advancements in spatial transcriptomics (ST) technologies have enabled the comprehensive measurement of gene expression profiles while preserving the spatial information of cells. Combining gene expression profiles with spatial information has been the most common way to identify spatial functional domains and genes. However, most existing spatial domain deciphering methods focus on spatially neighboring structures and fail to balance the self-characteristics of spots against their spatial structure dependency. Therefore, we propose a novel model called SpaGCAC, which recognizes spatial domains with the help of an adaptive feature-spatial balanced graph convolutional network named AFSBGCN. The AFSBGCN dynamically learns the relationship between spatial local topology structures and the self-characteristics of spots by adaptively increasing or decreasing the weight on the self-characteristics during message aggregation. Moreover, to better capture the local structures of spots, SpaGCAC exploits a local topology structure contrastive learning strategy. Meanwhile, SpaGCAC uses a probability distribution contrastive learning strategy to increase the similarity of probability distributions for points belonging to the same category. We validate the performance of SpaGCAC for spatial domain identification on four spatial transcriptomic datasets. In comparison with seven spatial domain recognition methods, SpaGCAC achieved the highest NMI median of 0.683 and the second highest ARI median of 0.559 on the multi-slice DLPFC dataset. SpaGCAC achieved the best results on all three other single-slice datasets. These results show that SpaGCAC outperforms most existing methods, providing enhanced insights into tissue heterogeneity.
Title : Enhancing Spatial Domain Identification in Spatially Resolved Transcriptomics Using Graph Convolutional Networks with Adaptively Feature-Spatial Balance and Contrastive Learning.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
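The SpaGCAC abstract describes message aggregation in which the weight on a spot's own features is raised or lowered adaptively. One simple way to picture this is a convex combination of a spot's features and the mean of its neighbours' features. In the sketch below the per-spot weight is passed in as a fixed number, whereas in the paper it is learned; the update rule and all names are assumptions for illustration and not the AFSBGCN layer itself.

```python
def aggregate(features, neighbors, self_weight):
    """One aggregation step: h_i' = w_i * h_i + (1 - w_i) * mean of neighbour features.

    features:    list of per-spot feature lists
    neighbors:   list of neighbour index lists, one per spot
    self_weight: list of weights in [0, 1], one per spot (learned in a real model)
    """
    updated = []
    for i, own in enumerate(features):
        nbrs = neighbors[i]
        if not nbrs:
            # An isolated spot keeps its own features unchanged.
            updated.append(list(own))
            continue
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(own))]
        w = self_weight[i]
        updated.append([w * x + (1.0 - w) * m for x, m in zip(own, mean)])
    return updated


spots = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph = [[1, 2], [0], []]
print(aggregate(spots, graph, [0.5, 0.0, 1.0]))
# [[0.75, 0.5], [1.0, 0.0], [1.0, 1.0]]
```

A weight near 1 preserves a spot's own expression profile, which suits spots at domain boundaries; a weight near 0 smooths the spot toward its spatial neighbourhood.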
Pub Date : 2024-09-27 DOI: 10.1109/TCBB.2024.3468434
Yuyang Xu, Jingbo Zhou, Haochao Ying, Jintai Chen, Wei Chen, Danny Z Chen, Jian Wu
Drug Target Interaction (DTI) prediction plays a crucial role in in-silico drug discovery, especially for deep learning (DL) models. Existing methods usually first extract features from drugs and target proteins, and then use drug-target pairs to train DL models. However, these DL-based methods essentially rely on similar structures and patterns defined by homologous proteins in a large amount of data. When few drug-target interactions are known for a newly discovered protein and its homologous proteins, prediction performance can drop notably. In this paper, we propose a novel Protein-Context enhanced Master/Slave Framework (PCMS) for zero-shot DTI prediction. This framework facilitates the efficient discovery of ligands for newly discovered target proteins, addressing the challenge of predicting interactions without prior data. Specifically, the PCMS framework consists of two main components: a Master Learner and a Slave Learner. The Master Learner first learns the target protein context information, and then adaptively generates the corresponding parameters for the Slave Learner. The Slave Learner then performs zero-shot DTI prediction in different protein contexts. Extensive experiments verify the effectiveness of PCMS compared to state-of-the-art methods across various metrics on two public datasets. The code and the processed data will be released once the paper is accepted.
Title : A Protein-Context Enhanced Master Slave Framework for Zero-Shot Drug Target Interaction Prediction.
Journal : IEEE/ACM Transactions on Computational Biology and Bioinformatics
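The Master/Slave arrangement in the PCMS abstract, where one learner generates the parameters of another, resembles a hypernetwork. The toy sketch below uses a single linear map as the Master and a logistic scorer as the Slave; the layer shapes, the function names, and the hand-set weights are hypothetical and only show the data flow, not the PCMS architecture or its training.

```python
import math


def master_generate(protein_context, hyper_weights):
    """Master Learner stand-in: map a protein-context vector to Slave parameters."""
    return [sum(w * c for w, c in zip(row, protein_context)) for row in hyper_weights]


def slave_predict(drug_features, slave_params):
    """Slave Learner stand-in: logistic interaction score for one drug."""
    z = sum(p * x for p, x in zip(slave_params, drug_features))
    return 1.0 / (1.0 + math.exp(-z))


# Hand-set hyper-weights; a trained model would learn these.
hyper_weights = [[1.0, 0.0], [0.0, 1.0]]
drug = [0.0, 5.0]
for context in ([1.0, 0.0], [0.0, 1.0]):
    params = master_generate(context, hyper_weights)
    print(context, round(slave_predict(drug, params), 3))
```

Because the Slave's parameters are produced from the protein context at prediction time, the same drug can receive different scores for different proteins, including proteins with no known interactions, which is the zero-shot setting the abstract targets.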
Pub Date : 2024-09-27 DOI: 10.1109/TCBB.2024.3470592
Garud Iyengar, Mitch Perry
Models for microbial interactions attempt to understand and predict the steady state network of inter-species relationships in a community, e.g. competition for shared metabolites, and cooperation through cross-feeding. Flux balance analysis (FBA) is an approach that was introduced to model the interaction of a particular microbial species with its environment. This approach has been extended to analyzing interactions in a community of microbes; however, these approaches have two important drawbacks: first, one has to numerically solve a differential equation to identify the steady state, and second, there are no methods available to analyze the stability of the steady state. We propose a game theory based community FBA model wherein species compete to maximize their individual growth rate, and the state of the community is given by the resulting Nash equilibrium. We develop a computationally efficient method for directly computing the steady state biomasses and fluxes without solving a differential equation. We also develop a method to determine the stability of a steady state to perturbations in the biomasses and to invasion by new species. We report the results of applying our proposed framework to a small community of four E. coli mutants that compete for externally supplied glucose, as well as cooperate since the mutants are auxotrophic for metabolites exported by other mutants, and a more realistic model for a gut microbiome consisting of nine species.
Game-theoretic Flux Balance Analysis Model for Predicting Stable Community Composition. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2024-09-27.
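The Nash-equilibrium idea in the abstract above (each species maximizes its own growth rate, and the community state is a point where no species gains by deviating alone) can be illustrated with a deliberately tiny sketch. This is not the authors' community FBA model: there is no metabolic network or linear program here, the two strategies ("hoard" and "share") are hypothetical labels, and every growth rate in the table is invented for illustration.

```python
from itertools import product

# Hypothetical growth rates (1/h) for two strains. All numbers are invented
# for illustration and do not come from the paper.
# Key: (strategy of strain A, strategy of strain B) -> (growth A, growth B)
GROWTH = {
    ("hoard", "hoard"): (0.20, 0.20),
    ("hoard", "share"): (0.50, 0.10),
    ("share", "hoard"): (0.10, 0.50),
    ("share", "share"): (0.40, 0.40),
}
STRATEGIES = ("hoard", "share")


def is_nash(profile, growth=GROWTH, strategies=STRATEGIES):
    """Return True if no strain can grow faster by deviating alone."""
    a, b = profile
    growth_a, growth_b = growth[profile]
    # Strain A deviates while strain B's strategy stays fixed.
    if any(growth[(alt, b)][0] > growth_a for alt in strategies):
        return False
    # Strain B deviates while strain A's strategy stays fixed.
    if any(growth[(a, alt)][1] > growth_b for alt in strategies):
        return False
    return True


def pure_nash_equilibria(growth=GROWTH, strategies=STRATEGIES):
    """Enumerate all pure-strategy Nash equilibria of the two-strain game."""
    return [p for p in product(strategies, repeat=2)
            if is_nash(p, growth, strategies)]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these invented numbers the only equilibrium is mutual hoarding, even though mutual sharing would give both strains faster growth. In a community FBA setting, each strain's payoff would instead come from solving its own flux balance problem given the other strains' behavior.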
Pub Date : 2024-09-26 DOI: 10.1109/TCBB.2024.3469178
Haixi Zhang, Jiahui Yang, Chenyan Lv, Xing Wei, Haibin Han, Bin Liu
Apple leaf diseases can seriously affect apple production and quality, and accurately detecting them can improve the efficiency of disease monitoring. Owing to the complex natural growth environment, apple leaf lesions are easily confused with background noise, which degrades detection performance. In this study, a cascaded Incremental Region Proposal Network (Inc-RPN) is proposed to accurately detect apple leaf diseases in natural environments. The proposed Inc-RPN has a two-layer RPN architecture in which the precursor RPN generates diseased leaf proposals and the successor RPN extracts target disease spots based on those proposals. In the successor RPN, a low-level feature aggregation module is designed to fully utilize the bridged features and preserve the semantic information of the target disease spots. An incremental module is also used to extract aggregated diseased leaf features and target disease spot features. Finally, a novel position anchor generator is designed to generate anchors based on diseased leaf proposals. The experimental results show that the proposed Inc-RPN achieves strong detection performance on the FALD_CED and Apple Leaf Disease datasets, indicating that it can accurately detect apple leaf diseases.
Incremental RPN: Hierarchical Region Proposal Network for Apple Leaf Disease Detection in Natural Environments. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2024-09-26.
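The abstract above mentions generating anchors based on diseased leaf proposals but does not say how. As a rough sketch of one way this could work, the function below tiles standard RPN-style anchors inside a single proposal box and clips them to that box. The function name, the grid stride, and the scale and ratio values are all assumptions made for illustration, not details of the paper's position anchor generator.

```python
from math import sqrt


def anchors_in_proposal(proposal, stride=16, scales=(16, 32),
                        ratios=(0.5, 1.0, 2.0)):
    """Tile RPN-style anchors over one leaf proposal box.

    proposal: (x1, y1, x2, y2) in pixels. Anchors are centred on a grid with
    the given stride inside the box and clipped to the box. The stride,
    scales and ratios are illustrative defaults, not values from the paper.
    """
    x1, y1, x2, y2 = proposal
    anchors = []
    cy = y1 + stride / 2
    while cy < y2:
        cx = x1 + stride / 2
        while cx < x2:
            for scale in scales:
                for ratio in ratios:
                    # ratio = height / width; unclipped area = scale ** 2
                    w = scale / sqrt(ratio)
                    h = scale * sqrt(ratio)
                    anchors.append((
                        max(x1, cx - w / 2), max(y1, cy - h / 2),
                        min(x2, cx + w / 2), min(y2, cy + h / 2),
                    ))
            cx += stride
        cy += stride
    return anchors


if __name__ == "__main__":
    boxes = anchors_in_proposal((0, 0, 32, 32))
    print(len(boxes), boxes[0])
```

Restricting anchors to the proposal box means the second stage only searches for disease spots where a diseased leaf was already found, which is the motivation the abstract gives for the cascaded design.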