Pub Date : 2025-10-23 DOI: 10.1007/s12539-025-00767-7
Wei Zhang, Yue Yu, Yuanyuan Li, Xiaoying Zheng, Juan Shen
Single-cell RNA transcriptome data offer a valuable opportunity to investigate biological mechanisms such as cellular heterogeneity. Accurate identification of cell subtypes is of great importance for revealing the molecular mechanisms underlying complex diseases. Designing computational methods for cell type identification has recently been an active research topic, and various algorithms have been designed to estimate cell type composition. However, owing to the high sparsity, noise, and dimensionality of available scRNA-seq data, improving prediction performance remains a challenge. In this work, a new cell type identification method named LRRS is developed by integrating low rank representation (LRR) and symmetric orthogonal decomposition. Unlike spectral embedding algorithms, in which the number of clusters is predefined, LRRS introduces a new orthogonal symmetric decomposition strategy and adaptively characterizes local properties by measuring the weighted distance in the orthogonal space. To optimize the graph model, an efficient iterative approach is proposed that updates each variable alternately using the alternating direction method of multipliers (ADMM). Based on the resulting similarity matrix, a spectral algorithm is adopted to group the individual cells. To evaluate the performance of LRRS, we applied it to eleven benchmark datasets and compared it with fourteen other state-of-the-art methods in terms of prediction accuracy and normalized mutual information. The comparison results show that LRRS is effective in predicting cell type composition.
Title: Joint Low Rank Representation with Symmetric Orthogonal Decomposition for Clustering of scRNA-seq Data.
Journal: Interdisciplinary Sciences: Computational Life Sciences
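Normalized mutual information (NMI), one of the two evaluation measures named in the abstract above, can be illustrated with a minimal sketch. The abstract does not say which normalization the authors use; the arithmetic-mean form, 2·I(U;V) / (H(U) + H(V)), is an assumption here, and the function name is hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_true, labels_pred):
    """NMI with arithmetic-mean normalization: 2*I(U;V) / (H(U) + H(V)).

    The choice of normalization is an assumption, not taken from the paper.
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("label lists must be non-empty and of equal length")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        # Shannon entropy (natural log) of a partition given its cluster sizes.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information between the two partitions.
    mutual_info = sum(
        (c / n) * math.log((c * n) / (count_true[u] * count_pred[v]))
        for (u, v), c in count_joint.items()
    )

    h_true, h_pred = entropy(count_true), entropy(count_pred)
    if h_true + h_pred == 0.0:
        return 1.0  # both partitions consist of a single cluster
    return 2.0 * mutual_info / (h_true + h_pred)
```

NMI is invariant to relabeling of clusters, which is why it suits clustering evaluation: predicted cluster IDs need not match the reference cell type names.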
Pub Date : 2025-10-21 DOI: 10.1007/s12539-025-00779-3
Qi Wang, Zhiheng Zhou, Guiying Yan
Finding effective drug combinations is a pivotal strategy for enhancing therapeutic efficacy and overcoming drug resistance in complex diseases like cancer. While computational methods have accelerated this discovery, most existing models are confined to predicting pairwise interactions, failing to capture the complex, higher-order synergies inherent in multi-drug regimens. To bridge this critical gap, we introduce an enhanced hypergraph random walk (EHRW) model uniquely designed to predict effective drug combinations. Our framework naturally represents multi-drug relationships using hypergraphs and leverages network topology to predict combination efficacy. Recognizing that network structure alone may not fully capture the intricate biological properties of drugs, we further propose a robust post-processing strategy that refines initial predictions by integrating auxiliary drug features. This method, which uses chemical similarity derived from SMILES fingerprints, serves as a powerful validation layer, significantly boosting the model's predictive accuracy. We demonstrate the superior performance of our enhanced EHRW model through rigorous validation on two major cancer datasets (lung and breast cancer). Our results show that the chemical similarity-based post-processing strategy outperforms the original model and several contemporary baselines. Importantly, our model extends beyond binary prediction by introducing a straightforward scoring method for three-drug combinations, which averages the predicted scores of their constituent binary pairs and provides a practical pathway for evaluating higher-order therapies. The enhanced EHRW model offers a flexible, accurate, and scalable computational tool, paving the way for more precise discovery of effective multi-drug regimens.
Title: A Hypergraph-Based Model for Predicting Potential Drug Combinations in Cancer Therapy.
Journal: Interdisciplinary Sciences: Computational Life Sciences
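The three-drug scoring rule described in the abstract above (averaging the predicted scores of the constituent pairs) can be sketched as follows. The function name, the dictionary layout, and the numeric scores are illustrative assumptions, not values or code from the paper.

```python
from itertools import combinations


def triple_score(drugs, pair_scores):
    """Score a three-drug combination as the mean of its three pairwise scores.

    pair_scores maps frozenset({drug_a, drug_b}) to a predicted pair score.
    """
    if len(set(drugs)) != 3:
        raise ValueError("expected three distinct drugs")
    pairs = [frozenset(p) for p in combinations(drugs, 2)]
    return sum(pair_scores[p] for p in pairs) / len(pairs)


# Hypothetical pairwise predictions (illustrative values only).
pair_scores = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"A", "C"}): 0.6,
    frozenset({"B", "C"}): 0.3,
}
score = triple_score(("A", "B", "C"), pair_scores)
```

Keying the scores by `frozenset` makes the lookup independent of drug order, which matches the symmetric nature of a combination.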
Single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) allows for deciphering the epigenetic landscape at single-cell resolution. The inaccuracies in annotations and the scarcity of available real datasets hinder the unbiased and comprehensive evaluation of computational methods designed for scATAC-seq data analysis, which underscores the importance of scATAC-seq data simulation methods. Existing scATAC-seq data simulation methods impose strict requirements on the prior distribution of the data and fail to generate simulated data with a consistent manifold structure aligned with real data. In this study, we propose DiTSim, a scATAC-seq data simulation method based on diffusion transformers. DiTSim efficiently fits the global distribution of real scATAC-seq datasets and stably synthesizes samples with known cell type annotations for assessing scATAC-seq data analysis pipelines. Through comprehensive experiments on multiple datasets, DiTSim has demonstrated its outstanding performance in achieving consistency with real data and robustness to datasets with diverse characteristics. Moreover, extensive enrichment analysis demonstrates that DiTSim has the capability to imbue simulated data with biological significance, a critical aspect often overlooked in prior studies.
Pub Date : 2025-10-21 DOI: 10.1007/s12539-025-00773-9
Shengze Dong, Songming Tang, Ding Liu, Shengquan Chen
Title: DiTSim: A Diffusion-Transformers Based Single-Cell ATAC-seq Data Simulator.
Journal: Interdisciplinary Sciences: Computational Life Sciences
Accurate identification of drug-target interactions (DTIs) is a crucial step in drug discovery. Computational DTI prediction methods can significantly reduce the time and cost associated with drug development. However, effectively integrating multisource features for high-precision DTI prediction remains a challenge. In this study, we propose EDeepDTI, an ensemble deep learning framework designed to increase the accuracy and generalizability of DTI predictions by efficiently integrating multi-view features. EDeepDTI calculates multiple molecular fingerprints to extract rich substructural information from drugs, leverages several advanced pre-trained models to generate drug and protein features enriched with structural and semantic information, and calculates multiple semantic similarity features for drugs and proteins using various similarity measures. During the ensemble learning process, we design a deep learning base learner for each unique pairing of drug and protein features. This ensures that each base learner captures distinct feature interactions, enhancing both independence and diversity within the ensemble. Finally, a greedy strategy is employed to aggregate the predictions from all base learners to improve overall performance. The experimental results demonstrate that EDeepDTI and its variant consistently outperform the baseline methods across multiple datasets and prediction tasks, highlighting the superior performance, robustness, and scalability of EDeepDTI.
Pub Date : 2025-10-21 DOI: 10.1007/s12539-025-00774-8
Zhixing Cheng, Qunfang Yan, Dewu Ding, Yanrui Ding
Title: A Scalable and Robust Ensemble Deep Learning Method for Predicting Drug-Target Interactions.
Journal: Interdisciplinary Sciences: Computational Life Sciences
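The abstract above says a greedy strategy aggregates the base learners' predictions but gives no details. One common form of greedy aggregation is forward selection with replacement on a validation set. The sketch below illustrates that general idea only; the Brier-score objective, the function names, and the stopping rule are all assumptions rather than the EDeepDTI procedure.

```python
def brier(pred, y):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def greedy_ensemble(base_preds, y_val, max_rounds=10):
    """Greedy forward selection (with replacement) of base learners.

    base_preds: dict mapping learner name -> list of validation probabilities.
    Returns (chosen_names, validation_loss); the ensemble prediction is the
    plain average of the chosen learners' outputs.
    """
    chosen = []
    running_sum = [0.0] * len(y_val)
    best_loss = float("inf")
    for _ in range(max_rounds):
        best_name = None
        k = len(chosen) + 1
        for name, preds in base_preds.items():
            # Loss of the averaged ensemble if this learner were added.
            candidate = [(s + p) / k for s, p in zip(running_sum, preds)]
            loss = brier(candidate, y_val)
            if loss < best_loss - 1e-12:
                best_loss, best_name = loss, name
        if best_name is None:
            break  # no learner improves the validation loss any further
        chosen.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, base_preds[best_name])]
    return chosen, best_loss
```

Selecting with replacement lets a strong learner be counted more than once, which acts as an implicit weighting without fitting separate weights.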
Pub Date : 2025-10-21 DOI: 10.1007/s12539-025-00778-4
Bao Zhang, Jing Wang, Weiwei Wang, Hongbo Zhang
Differential expression analysis is a crucial step in the analysis of single-cell transcriptomic data. Numerous statistical methods have been developed to conduct differential expression analysis by addressing the sparsity or heterogeneity of gene expression. Nevertheless, these approaches often overlook other critical characteristics of single-cell transcriptomic data, such as the high dimensionality of gene expression at the cellular level, which may lead to suboptimal performance. Furthermore, methods capable of locating and ordering genes along cell trajectories are still lacking. Here, we integrate polynomial fitting with hypergeometric testing to develop a new tool, DEAPLOG (Differential Expression Analysis and Pseudo-temporal Locating and Ordering of Genes), which leverages the high-dimensional gene expression characteristics at the cellular level in single-cell transcriptomic data. Benchmarking analyses on synthetic single-cell datasets demonstrate that DEAPLOG performs comparably to existing methods on datasets comprising only two cell clusters and outperforms them in differential expression analysis on datasets with multiple cell clusters. Furthermore, applications of DEAPLOG to real single-cell and spatial transcriptomic datasets validate its superior performance in terms of both accuracy and computational efficiency. Notably, when applied to single-cell transcriptomic data from the developmental hematopoietic system, DEAPLOG demonstrates precise gene localization and accurate ordering along developmental trajectories. Collectively, these findings establish DEAPLOG as a robust and highly effective tool for single-cell transcriptomic data analysis.
Title: DEAPLOG: Differential Expression Analysis and Pseudo-Temporal Locating and Ordering of Genes in Single-Cell Transcriptomic Data.
Journal: Interdisciplinary Sciences: Computational Life Sciences
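Hypergeometric testing, which the abstract above names as one of DEAPLOG's two ingredients, can be illustrated with a minimal upper-tail p-value computation. How DEAPLOG actually sets up the test is not stated in the abstract; the reading used in the docstring (cells expressing a gene being over-represented in a cluster) is an assumption, as are the function and parameter names.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K successes, n draws).

    Assumed reading for illustration: N cells in total, K of them expressing
    a gene, n cells in a cluster, k expressing cells observed in that cluster.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    lo = max(k, 0, n - (N - K))  # smallest attainable count at or above k
    hi = min(K, n)               # largest attainable count
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return favorable / total
```

Exact integer arithmetic via `math.comb` avoids floating-point error in the individual terms; only the final division produces a float.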
Pub Date : 2025-10-14 DOI: 10.1007/s12539-025-00777-5
Yibo Zhu, Xiumin Shi, Lu Wang, Jingjuan Zhang
Neoantigens, tumor-specific peptides with immunogenic potential, represent pivotal targets for cancer immunotherapy. Existing methods prioritize HLA-peptide binding but often fail to adequately address immunogenicity, limiting their clinical utility. This study introduces TrambaHLApan, a novel neoantigen prediction framework that integrates Transformer and Mamba architectures to concurrently predict antigen presentation likelihood (TrambaHLApan-EL) and immunogenic potential (TrambaHLApan-IM). A Transformer-based encoding module is employed to generate unique representations for HLA molecules and peptides. Subsequently, a hybrid fusion module, which combines merged-attention mechanisms with Mamba-based sequential modeling, is deployed to evaluate interaction patterns. TrambaHLApan-IM incorporates antigen presentation scores derived from TrambaHLApan-EL to explicitly model the interplay between antigen presentation and immunogenic potential, thereby enhancing the identification of neoantigens with high confidence. Experimental results on independent datasets demonstrate that TrambaHLApan outperforms state-of-the-art methods, establishing it as a reliable tool for advancing personalized cancer immunotherapies.
Title: TrambaHLApan: A Transformer and Mamba-based Neoantigen Prediction Method Considering both Antigen Presentation and Immunogenicity.
Journal: Interdisciplinary Sciences: Computational Life Sciences
Pub Date : 2025-10-13 DOI: 10.1007/s12539-025-00775-7
Chenyue Lei, Xiujuan Lei, Lian Liu, Jianrui Chen, Fang-Xiang Wu
Resistance to treatment remains one of the greatest challenges in cancer therapy. Recent studies have shown that drug sensitivity is closely associated with miRNA expression, highlighting the importance of predicting miRNA-drug interactions (MDIs) in understanding drug resistance mechanisms. In this study, we propose a method named MSFFMDI, which employs a dual-channel multi-source feature fusion framework based on heterogeneous networks to predict potential MDIs. The first channel focuses on attribute feature extraction. For miRNAs, we integrate the k-mer algorithm with word2vec to transform sequences into low-dimensional embeddings that capture semantic and structural information. For drugs, we use a graph isomorphism network to learn molecular structure features and apply mol2vec to capture chemical and functional sequence features. The second channel extracts topological features by constructing a heterogeneous network based on integrated similarities and known associations between miRNAs and drugs. A graph attention network is used to update node embeddings, and a multi-scale convolutional neural network is employed to further extract topological representations. The features from both channels are fused and reduced via principal component analysis before being used for final prediction. Extensive experiments show that MSFFMDI achieves excellent predictive performance on two datasets. Case studies further validate its robust performance. Overall, MSFFMDI provides a powerful and interpretable framework for predicting MDIs and offers potential insights into the mechanisms of drug resistance.
Title: Predicting miRNA-Drug Interactions Based on Multi-source Feature Fusion of Heterogeneous Network.
Journal: Interdisciplinary Sciences: Computational Life Sciences
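The k-mer step mentioned in the abstract above turns an miRNA sequence into overlapping "words" that a word2vec model can then embed. A minimal sketch of that tokenization follows. The function name, the stride of 1, the default k = 3, and the example sequence are assumptions, since the abstract does not give these settings.

```python
def kmer_tokens(sequence, k=3):
    """Split a nucleotide sequence into overlapping k-mers (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty token list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Illustrative 8-nucleotide RNA fragment (not taken from the paper).
tokens = kmer_tokens("UGAGGUAG", k=3)
```

Each sequence then becomes a "sentence" of k-mer tokens, and a word2vec implementation can be trained on the collection of sentences to obtain one embedding per k-mer. That training step is not shown here.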
Pub Date : 2025-10-13 DOI: 10.1007/s12539-025-00771-x
Xiaoyan Sun, Zhenjie Hou, Wenguang Zhang, Yan Chen, Haibin Yao
Conventional approaches to drug discovery often require considerable time and effort. A promising solution is to repurpose existing drugs by identifying new therapeutic roles, thereby enhancing development efficiency. Drug repositioning based on computational methods is gaining widespread attention. However, most computational methods rely primarily on similarity-based data to extract association features; they neither mine the topological structural features of the association network nor use valuable original biological and chemical information. Therefore, this article develops a drug repositioning approach via meta-path integration of multi-source biological information (MPMB-DR). This approach combines meta-path and biomolecular similarity information to construct high-quality negative links within heterogeneous networks. It considers both the topological structure of the association network and the relationships among biomolecules. Based on this negative sampling strategy, potential drug-disease associations are predicted by leveraging the synergy between meta-paths and multi-source biological data. Experimental results and case studies demonstrate that the MPMB-DR method has significant advantages in identifying potential drug-disease associations.
Title: MPMB-DR: Meta-path Integration of Multi-source Biological Information for Drug Repositioning.
Journal: Interdisciplinary Sciences: Computational Life Sciences
Pub Date : 2025-10-13 DOI: 10.1007/s12539-025-00776-6
Bo Wang, Wenlong Zhao, Xiaoxin Du, Jianfei Zhang, Jingyou Li, Hang Sun
Extensive research has underscored the intricate relationships between microbial communities and human diseases. Studying these associations enhances our understanding of disease mechanisms and facilitates the development of novel therapeutic strategies. Although traditional biological methods for identifying microbe-disease associations (MDAs) are reliable, they often entail high costs, extended timelines, and substantial manual effort. To address these limitations, this study introduces GRNCFMDA, a deep learning framework designed to improve MDA prediction efficiency. First, the model integrates functional and Gaussian interaction profile (GIP) similarities of microbes, along with semantic and GIP similarities of diseases, to construct a comprehensive heterogeneous network. A graph random neural network (GRAND) enhanced with attention mechanisms is then applied to derive informative high-order representations of microbe and disease nodes. This is followed by a neural collaborative filtering module that combines generalized matrix factorization, which models linear interactions, with multilayer perceptrons, which capture nonlinear patterns. Performance evaluations based on five-fold cross-validation on the HMDAD and Disbiome datasets show that GRNCFMDA consistently outperforms four existing MDA prediction models. Additionally, case studies affirm the model's practical utility in uncovering novel MDAs. The implementation and datasets are publicly available at https://github.com/chenyunmolu/GRNCFMDA.
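As an illustration of the GIP similarity mentioned in the abstract, the following is a minimal sketch of the Gaussian interaction profile kernel as it is commonly defined in the association-prediction literature. The abstract does not give the authors' exact formulation, so the bandwidth normalization (gamma_prime divided by the mean squared profile norm), the function name `gip_similarity`, and the toy association matrix are all assumptions made for illustration.

```python
import numpy as np


def gip_similarity(assoc: np.ndarray, gamma_prime: float = 1.0) -> np.ndarray:
    """Gaussian interaction profile kernel between the rows of `assoc`.

    Each row is the binary interaction profile of one entity (e.g. a microbe),
    and each column is one partner entity (e.g. a disease).
    """
    profiles = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    n = profiles.shape[0]
    if mean_sq_norm == 0.0:
        # No known associations: all profiles are identical (all zeros).
        return np.ones((n, n))
    # Assumed bandwidth: gamma_prime normalized by the mean squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


# Hypothetical toy data: 3 microbes (rows) x 4 diseases (columns).
assoc = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
])
microbe_sim = gip_similarity(assoc)    # microbe-microbe similarity, shape (3, 3)
disease_sim = gip_similarity(assoc.T)  # disease-disease similarity, shape (4, 4)
```

Microbes with identical association profiles receive a similarity of 1, and the similarity decays as profiles diverge. In a pipeline like the one described, such matrices would typically be combined with the functional or semantic similarities before the heterogeneous network is built, though how GRNCFMDA combines them is not stated in the abstract.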
Title : Predicting Potential Microbe-Disease Associations Based on Heterogeneous Graph Random Attention Neural Network and Neural Collaborative Filtering. Journal : Interdisciplinary Sciences: Computational Life Sciences
Pub Date : 2025-09-29 DOI: 10.1007/s12539-025-00765-9
Shengpeng Yu, Zihan Yang, Tianyu Liu, Cheng Liang, Hong Wang
The advent of single-cell transcriptome sequencing (scRNA-seq) has revolutionized our ability to analyze gene expression at the individual cell level, overcoming the limitations of bulk RNA sequencing. However, the explosive growth of scRNA-seq data and the prevalence of dropout events pose significant challenges for downstream analysis. Existing methodologies often focus on isolated tasks, such as identifying cell communities, processing dropout events, and mitigating batch effects; they neglect collaborative multi-task analysis and introduce new noise when handling dropout events. In response to these challenges, we propose scIMTA (interpretable multi-task analysis of single-cell), a framework designed to enhance interpretability and to address the issues of topological structure preservation and dropout events. scIMTA enables collaborative multi-task analysis of sparse, high-noise gene expression data, enhances interpretability through biological grounding, and robustly handles dropout events while preserving data integrity. Its efficacy and generalizability are demonstrated through validation on breast cancer scRNA-seq datasets. scIMTA thus establishes a framework for collaborative multi-task analysis, interpretability, and robust dropout handling in single-cell transcriptome studies, allowing a more nuanced exploration of cellular heterogeneity and gene expression dynamics. The source code of scIMTA is available for download at https://github.com/ShengPengYu/scIMTA.
Title : Interpretable Multi-task Analysis of Single-Cell RNA-seq Data Through Topological Structure Preservation and Data Denoising. Journal : Interdisciplinary Sciences: Computational Life Sciences