Pub Date: 2026-03-01 | Epub Date: 2025-06-18 | DOI: 10.1007/s12539-025-00708-4
T Thanya, T Jeslin
Brain tumor classification using Magnetic Resonance Imaging (MRI) images is an important and emerging field at the intersection of medical imaging and artificial intelligence. With advancements in deep learning and machine learning, researchers and clinicians are leveraging these tools to create models that can reliably detect and classify brain tumors from MRI data. However, the task faces a number of challenges, including the intricacy of tumor types and grades, intensity variations in MRI data, and tumors of varying severity. This paper proposes a Multi-Grade Hierarchical Classification Network Model (MGHCN) for the hierarchical classification of tumor grades in MRI images. The model's distinctive feature lies in its ability to categorize tumors into multiple grades, thereby capturing the hierarchical nature of tumor severity. To address variations in intensity levels across different MRI samples, an Improved Adaptive Intensity Normalization (IAIN) pre-processing step is employed. This step standardizes intensity values, mitigating the impact of intensity variations and ensuring more consistent analyses. The model utilizes the Dual Tree Complex Wavelet Transform with Enhanced Trigonometric Features (DTCWT-ETF) for efficient feature extraction. DTCWT-ETF captures both spatial and frequency characteristics, allowing the model to distinguish between different tumor types more effectively. In the classification stage, the framework introduces the Adaptive Hierarchical Optimized Horse Herd BiLSTM Fusion Network (AHOHH-BiLSTM). This multi-grade classification model is designed with a comprehensive architecture, including distinct layers that enhance the learning process and adaptively refine parameters. The purpose of this study is to improve the precision of distinguishing different grades of tumors in MRI images.
To evaluate the proposed MGHCN framework, a set of evaluation metrics is incorporated, including precision, recall, and the F1-score. The framework employs the BraTS Challenge 2021, Br35H, and BraTS Challenge 2023 datasets, a combination that ensures comprehensive training and evaluation. By utilizing these datasets along with a comprehensive set of evaluation metrics, the MGHCN framework aims to enhance brain tumor classification in MRI images and provide a more thorough understanding of its capabilities and performance.
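The three reported metrics follow directly from confusion-matrix counts. A minimal one-vs-rest sketch (toy labels for illustration, not the paper's data):

```python
from typing import Sequence

def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1):
    """Compute precision, recall, and F1 for one class treated one-vs-rest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy multi-grade labels; evaluate grade 2 one-vs-rest.
y_true = [2, 0, 2, 2, 1]
y_pred = [2, 2, 2, 0, 1]
p, r, f = precision_recall_f1(y_true, y_pred, positive=2)  # each 2/3 here
```

Per-grade scores computed this way can then be macro-averaged across grades to summarize multi-grade performance.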
"Automated Multi-grade Brain Tumor Classification Using Adaptive Hierarchical Optimized Horse Herd BiLSTM Fusion Network in MRI Images." T Thanya, T Jeslin. Interdisciplinary Sciences: Computational Life Sciences, pp. 77-100. DOI: 10.1007/s12539-025-00708-4
Survival prediction involves multiple factors, such as histopathological image data and omics data, making it a typical multimodal task. In this work, we introduce semantic annotations for genes in different cell types based on cell biology knowledge, enabling the model to achieve interpretability at the cellular level. Since these cell type annotations are derived from the unique sites of origin for each cancer type, they can be more closely aligned with morphological features in whole slide images (WSIs) and address the issue of genomic annotation ambiguity. We then propose a multimodal fusion model, SurvTransformer, with multi-layer attention to fuse cell type tags (CTTs) and WSIs for survival prediction. Finally, through attention and integrated gradient attribution, the model provides biologically meaningful interpretable analysis at three different levels: cell type, gene, and histopathology image. Comparative experiments show that SurvTransformer achieves the highest concordance index across four cancer datasets. The separation between the generated survival curves is also statistically significant. Ablation experiments show that SurvTransformer outperforms models based on different labeling methods and attention representations. In terms of interpretability, case studies validate the effectiveness of SurvTransformer at three levels: cell type, gene, and histopathological image.
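The attention-based fusion of CTTs and WSIs can be illustrated with plain scaled dot-product cross-attention, in which hypothetical WSI patch embeddings attend over cell-type tag embeddings. This is a generic sketch of the mechanism, not the authors' SurvTransformer implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized softmax
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query token attends over all key/value tokens."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_kv) similarity logits
    weights = softmax(scores, axis=-1)       # each row is a distribution over tags
    return weights @ values, weights

rng = np.random.default_rng(0)
wsi_patches = rng.normal(size=(6, 16))   # hypothetical WSI patch embeddings
ctt_tokens = rng.normal(size=(4, 16))    # hypothetical cell-type tag embeddings
fused, attn = cross_attention(wsi_patches, ctt_tokens, ctt_tokens)
```

The attention weights themselves are what enables the cell-type-level interpretability described above: each row of `attn` shows which tags a patch relied on.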
"Interpretable Cancer Survival Prediction by Fusing Semantic Labelling of Cell Types and Whole Slide Images." Jinchao Chen, Pei Liu, Chen Chen, Ying Su, Jiajia Wang, Cheng Chen, Xiantao Ai, Xiaoyi Lv. Interdisciplinary Sciences: Computational Life Sciences, pp. 46-59. Pub Date: 2026-03-01. DOI: 10.1007/s12539-025-00744-0
Pub Date: 2026-03-01 | Epub Date: 2025-02-21 | DOI: 10.1007/s12539-025-00688-5
Shengze Dong, Zhuorui Cui, Ding Liu, Jinzhi Lei
Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research, facilitating the examination of gene expression at the individual cell level within a given tissue sample. While numerous tools have been developed for scRNA-seq data analysis, the challenge persists in capturing the distinct features of such data and replicating virtual datasets that share analogous statistical properties. Our study introduces a generative approach termed scRNA-seq Diffusion Transformer (scRDiT). This method generates virtual scRNA-seq data by leveraging a real dataset. The method is a neural network built on Denoising Diffusion Probabilistic Models (DDPMs) and Diffusion Transformers (DiTs). Training involves gradually corrupting real samples with Gaussian noise through iterative noise-adding steps; the network then learns to reverse this process, denoising random noise into realistic scRNA-seq samples. This scheme allows the model to learn data features from actual scRNA-seq samples during training. Our experiments, conducted on two distinct scRNA-seq datasets, demonstrate superior performance. Additionally, the model sampling process is expedited by incorporating Denoising Diffusion Implicit Models (DDIMs). scRDiT presents a unified methodology empowering users to train neural network models with their unique scRNA-seq datasets, enabling the generation of numerous high-quality scRNA-seq samples.
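The iterative noise-adding step has a well-known closed form in DDPMs: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps, with a_bar_t the cumulative product of (1 - beta). A minimal sketch using a standard linear beta schedule and a hypothetical cells-by-genes matrix (not the scRDiT code):

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Closed-form DDPM forward step: jump straight from x_0 to x_t."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]          # cumulative signal-retention factor
    eps = rng.normal(size=x0.shape)            # fresh Gaussian noise
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(1)
betas = np.linspace(1e-4, 0.02, 1000)          # common linear schedule, T = 1000
x0 = rng.normal(size=(8, 2000))                # hypothetical normalized profiles (cells x genes)
x_noisy = forward_diffuse(x0, 999, betas, rng) # at t = T-1 this is near pure noise
```

At the final step almost no signal remains, which is why sampling can start from pure Gaussian noise; DDIM accelerates the reverse pass by skipping timesteps deterministically.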
"scRDiT: Generating Single-cell RNA-seq Data by Diffusion Transformers and Accelerating Sampling." Shengze Dong, Zhuorui Cui, Ding Liu, Jinzhi Lei. Interdisciplinary Sciences: Computational Life Sciences, pp. 314-325. DOI: 10.1007/s12539-025-00688-5
Accurate pulmonary nodule detection in CT imaging remains challenging due to fragmented feature integration in conventional deep learning models. This paper proposes SPCF-YOLO, a real-time detection framework that synergizes hierarchical feature fusion with anatomical context modeling. First, the space-to-depth convolution (SPDConv) module preserves fine-grained features in low-resolution images through spatial dimension reorganization. Second, the shared feature pyramid convolution (SFPConv) module dynamically extracts multi-scale contextual information using multi-dilation-rate convolutional layers. A small-object detection layer is incorporated to improve sensitivity to small nodules, in combination with the improved pyramid squeeze attention (PSA) module and the improved contextual transformer (CoTB) module, which enhance global channel dependencies and reduce feature loss. The model achieves 82.8% mean average precision (mAP) and an 82.9% F1 score on LUNA16 at 151 frames per second (improvements of 17.5% and 82.9%, respectively, over YOLOv8), demonstrating real-time clinical viability. Cross-modality validation on SIIM-COVID-19 shows a 1.5% improvement, confirming robust generalization.
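The spatial dimension reorganization behind SPDConv is the standard space-to-depth rearrangement: resolution is traded for channels so that downsampling discards no pixels, after which a stride-1 convolution mixes the stacked channels. A minimal sketch of the rearrangement itself (not the SPCF-YOLO implementation):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange (C, H, W) -> (C*block*block, H//block, W//block), keeping every pixel."""
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(c, h // block, block, w // block, block)
    x = x.transpose(0, 2, 4, 1, 3)                 # (C, b, b, H/b, W/b)
    return x.reshape(c * block * block, h // block, w // block)

feat = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)  # toy 2-channel 4x4 feature map
out = space_to_depth(feat, block=2)                        # shape (8, 2, 2)
```

Unlike strided convolution or pooling, this is a lossless permutation of the tensor, which is why it helps preserve the fine-grained evidence small nodules leave in low-resolution feature maps.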
"SPCF-YOLO: An Efficient Feature Optimization Model for Real-Time Lung Nodule Detection." Yawen Ren, Chenyang Shi, Donglin Zhu, Changjun Zhou. Interdisciplinary Sciences: Computational Life Sciences, pp. 231-252. Pub Date: 2026-03-01. DOI: 10.1007/s12539-025-00720-8
Pub Date: 2026-03-01 | Epub Date: 2025-07-11 | DOI: 10.1007/s12539-025-00727-1
Lei Li, Miaosen Xue, Songyang Li, Zhuoli Dong, Tianli Liao, Peng Li
Semi-supervised medical image segmentation techniques have demonstrated significant potential and effectiveness in clinical diagnosis. The prevailing approaches using the mean-teacher (MT) framework achieve promising image segmentation results. However, due to the unreliability of the pseudo labels generated by the teacher model, existing methods still have inherent limitations that must be considered and addressed. In this paper, we propose an innovative semi-supervised method for medical image segmentation that combines a heterogeneous complementary correction network with confidence contrastive learning (HC-CCL). Specifically, we develop a triple-branch framework by integrating a heterogeneous complementary correction (HCC) network into the MT framework. HCC serves as an auxiliary branch that corrects prediction errors in the student model and provides complementary information. To improve the capacity for feature learning in our proposed model, we introduce a confidence contrastive learning (CCL) approach with a novel sampling strategy. Furthermore, we develop a momentum style transfer (MST) method to narrow the gap between labeled and unlabeled data distributions. In addition, we introduce a Cutout-style augmentation for unsupervised learning to enhance performance. Three medical image datasets (the left atrial (LA) dataset, the NIH pancreas dataset, and the BraTS 2019 dataset) were employed to rigorously evaluate HC-CCL. Quantitative results demonstrate significant performance advantages over existing approaches, achieving state-of-the-art performance across all metrics. The implementation will be released at https://github.com/xxmmss/HC-CCL .
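In the MT framework the teacher is conventionally an exponential moving average (EMA) of the student, which is what makes its pseudo labels smoother but not fully reliable. A minimal sketch of that update (the decay value is hypothetical):

```python
def ema_update(teacher_params, student_params, decay=0.99):
    """Mean-teacher update: teacher <- decay * teacher + (1 - decay) * student."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]

# Toy two-parameter model: the teacher drifts slowly toward the student.
teacher = [1.0, 0.0]
student = [0.0, 1.0]
teacher = ema_update(teacher, student, decay=0.9)  # -> approximately [0.9, 0.1]
```

Because the teacher lags the student, its errors persist across iterations; the HCC auxiliary branch described above is one way to inject an independent signal that corrects them.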
"Semi-supervised Medical Image Segmentation Using Heterogeneous Complementary Correction Network and Confidence Contrastive Learning." Lei Li, Miaosen Xue, Songyang Li, Zhuoli Dong, Tianli Liao, Peng Li. Interdisciplinary Sciences: Computational Life Sciences, pp. 211-230. DOI: 10.1007/s12539-025-00727-1
Protein structures are fundamental to understanding their functions and interactions. With the continuous advancement of protein structure prediction methods, structure databases are rapidly expanding. Identifying the origin of protein structures is crucial for assessing the reliability of experimental structure determination and computational prediction methods, as well as for guiding downstream biological research. Existing protein representation approaches often fail to capture subtle yet critical structural differences, posing challenges for precise structural traceability. To address this, we propose a structure-sensitive supervised deep learning model, Crystal vs Predicted Evaluator for Protein Structure (CPE-Pro), for the representation and origin evaluation of protein structures. CPE-Pro integrates a pre-trained protein Structural Sequence Language Model (SSLM) and a Geometric Vector Perceptron-Graph Neural Network (GVP-GNN) to learn structure-aware protein representations and capture structural differences, enabling accurate classification across four origins of structural data. Preliminary results indicate that, compared to large-scale protein language models trained on extensive amino acid sequences, structural sequences enriched with local structural features enable the model to capture more informative protein characteristics, thereby enhancing and refining protein representations. Future research directions include extending the architecture to additional protein structure paradigms and developing evaluation methodologies for low-pLDDT predicted structures, providing more effective tools for protein structure analysis. The code, model weights, and all relevant materials are available at https://github.com/wr1102/CPE-Pro .
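The GVP-GNN component builds on the geometric vector perceptron, which processes scalar and 3D vector features jointly: vectors enter the scalar channel only through their rotation-invariant norms, and vector outputs are gated by norm-derived scalars, keeping the layer rotation-equivariant. A toy single-layer sketch with random weights (all shapes hypothetical; not CPE-Pro's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gvp(s, V, Wh, Wm, Wmu, bm):
    """One geometric-vector-perceptron-style layer on scalar features s and vector features V."""
    Vh = Wh @ V                          # mix vector channels: (nu_h, 3)
    norms = np.linalg.norm(Vh, axis=1)   # rotation-invariant vector summaries
    s_out = np.maximum(0.0, Wm @ np.concatenate([s, norms]) + bm)  # ReLU scalar update
    Vmu = Wmu @ Vh                       # vector outputs: (mu, 3)
    gate = sigmoid(np.linalg.norm(Vmu, axis=1))  # norm-based per-vector gate
    return s_out, Vmu * gate[:, None]

rng = np.random.default_rng(2)
s = rng.normal(size=4)                   # 4 scalar features
V = rng.normal(size=(3, 3))              # 3 vector features in R^3
Wh = rng.normal(size=(5, 3)); Wmu = rng.normal(size=(2, 5))
Wm = rng.normal(size=(6, 4 + 5)); bm = rng.normal(size=6)
s_out, V_out = gvp(s, V, Wh, Wm, Wmu, bm)
```

Rotating the input vectors rotates `V_out` identically while leaving `s_out` unchanged, which is the property that lets such layers compare geometries of crystal versus predicted structures independent of orientation.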
"CPE-Pro: A Structure-Sensitive Deep Learning Method for Protein Representation and Origin Evaluation." Wenrui Gou, Wenhui Ge, Yang Tan, Mingchen Li, Guisheng Fan, Huiqun Yu. Interdisciplinary Sciences: Computational Life Sciences, pp. 195-210. Pub Date: 2026-03-01. DOI: 10.1007/s12539-025-00732-4
The discovery of neuropeptides offers numerous opportunities for identifying novel drugs and targets to treat a variety of diseases. While various computational methods have been proposed, there remains potential for further performance improvement. In this work, we introduce NeuroPpred-MSN, an innovative and efficient neuropeptide prediction model that leverages multi-feature fusion and Siamese networks. To comprehensively represent the information of neuropeptides, the peptide sequences are encoded by four encoding schemes (token embedding, word2vec embedding, protein language embedding, and handcrafted features). The token embedding and word2vec embedding are fed to a Siamese network channel. In the other channel of the model, peptide sequences and their secondary structure sequences are fed into the ProtT5-XL-UniRef50 model to generate embedding features, while handcrafted encoding techniques are used to extract physicochemical information. The two kinds of features are then fused and fed into a bidirectional gated recurrent unit (Bi-GRU) network for further processing. Ultimately, the outputs of the two channels are integrated into a fully connected layer, thereby facilitating the generation of the final prediction. The results on the independent test set indicate that NeuroPpred-MSN exhibits superior predictive performance, with an area under the receiver operating characteristic curve (AUROC) of 98.3%, exceeding the performance of other state-of-the-art predictors. Specifically, compared to the next-best results, this model exhibits improvements of 1.52% in accuracy (ACC), 1.52% in F1 score (F1), 3.2% in Matthews correlation coefficient (MCC), and 1.55% in AUROC. The model was further evaluated on imbalanced datasets, where it achieved the highest values in AUROC, ACC, MCC, sensitivity (SN), and F1, further demonstrating its robustness and generalization.
The model can be accessed at the following GitHub repository: https://github.com/wenjean/NeuroPpred-MSN .
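The core Siamese idea, two branches sharing one encoder whose outputs are compared, can be sketched as follows (the encoder, weights, and cosine-similarity readout are hypothetical illustrations, not the NeuroPpred-MSN architecture):

```python
import numpy as np

def encode(x, W):
    """Shared-weight encoder: both Siamese branches use the SAME W."""
    return np.tanh(W @ x)

def siamese_score(x1, x2, W):
    """Cosine similarity between twin embeddings produced by the shared encoder."""
    e1, e2 = encode(x1, W), encode(x2, W)
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))

rng = np.random.default_rng(3)
W = rng.normal(size=(8, 20))              # one weight matrix shared by both branches
a = rng.normal(size=20)                   # hypothetical peptide feature vector
sim_self = siamese_score(a, a, W)         # identical inputs -> similarity 1
sim_other = siamese_score(a, rng.normal(size=20), W)
```

Weight sharing is the key design choice: it forces both inputs into one embedding space, so similarity reflects the peptides rather than branch-specific quirks.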
"NeuroPpred-MSN: A Neuropeptide Prediction Model Based on Multi-feature Fusion and Siamese Networks." Jian Wen, Minyu Chen, Yongqi Shen, Honghong Wang, Zhuoyu Wei, Lichuan Gu, Xiaolei Zhu. Interdisciplinary Sciences: Computational Life Sciences, pp. 326-340. Pub Date: 2026-03-01. DOI: 10.1007/s12539-025-00730-6
Dysregulation of microRNAs (miRNAs) is a cause of progression in numerous diseases. Uncovering miRNA-disease associations (MDAs) is essential for discovering new biomarkers. In contrast to conventional biological approaches, advanced computational approaches are typically more rapid and cost-effective. However, most computational methods still face several challenges: (i) integrating multi-source information (MSI); (ii) optimizing feature fusion; (iii) mitigating over-smoothing in graph-based models. This paper introduces a novel model, AMFCL. To encapsulate the miRNA-disease relationships, three types of networks are first constructed. After that, the node representations are learned via multi-layer graph sample and aggregate (GraphSAGE). An adaptive fusion mechanism (AFM) dynamically assigns weights to feature representations to optimize the fusion process. Additionally, a residual connection is used to combat the over-smoothing effect that occurs in graph-based models. The robustness of miRNA and disease embeddings is improved by contrastive learning (CL). Lastly, all feature embeddings are fed into a multi-layer perceptron (MLP) to compute MDA scores. Experimental results show remarkable improvements for AMFCL compared to advanced models. Moreover, relevant case studies systematically validate the approach's effectiveness in identifying unknown MDAs.
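The GraphSAGE aggregation step and a softmax-style adaptive fusion can be sketched on a toy graph. All weights are hypothetical, and AMFCL's actual AFM may differ in form:

```python
import numpy as np

def sage_mean_layer(H, adj, W_self, W_neigh):
    """GraphSAGE mean aggregation: combine each node with the mean of its neighbors."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh_mean = (adj @ H) / np.maximum(deg, 1)            # average neighbor features
    return np.maximum(0.0, H @ W_self + neigh_mean @ W_neigh)  # ReLU update

def adaptive_fuse(feats, logits):
    """Softmax-weighted fusion: learnable logits decide each source's contribution."""
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return sum(wi * f for wi, f in zip(w, feats))

rng = np.random.default_rng(4)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # toy 3-node graph
H = rng.normal(size=(3, 5))                                     # initial node features
W_self = rng.normal(size=(5, 4)); W_neigh = rng.normal(size=(5, 4))
H1 = sage_mean_layer(H, adj, W_self, W_neigh)
fused = adaptive_fuse([H1, rng.normal(size=(3, 4))], np.array([0.5, -0.5]))
```

Stacking several such layers deepens the receptive field but drives representations toward each other, which is the over-smoothing that a residual connection (adding `H`'s projection back into the output) counteracts.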
AMFCL: Predicting miRNA-Disease Associations Through Adaptive Multi-source Modality Fusion and Contrastive Learning
Yanfang Yang, Shuang Wang, Wenyue Kang, Cuina Jiao, Yinglian Gao, Jinxing Liu
DOI: 10.1007/s12539-025-00724-4
Interdisciplinary Sciences: Computational Life Sciences, pages 165-179, Pub Date: 2026-03-01
Identification of drug-target interactions (DTIs) is critical for drug discovery and drug repositioning. However, most DTI methods that extract features from drug molecules and protein entities neglect substructure information specific to pharmacological responses, which leads to poor predictive performance. Moreover, most existing methods rely on either molecular graphs or molecular descriptors to obtain abstract molecular representations, but combining the two feature-learning approaches for DTI prediction remains unexplored. Therefore, a new framework for DTI prediction, ASCS-DTI, is proposed. It uses a substructure attention mechanism to flexibly capture compound substructures at different granularities, allowing the important substructure information of each molecule to be learned. Additionally, the framework combines three different types of molecular fingerprints to comprehensively characterize molecular representations. A stacked convolutional encoding module processes the sequence information of target proteins in a multi-scale, multi-level view. Finally, a feature fusion module performs multi-modal fusion of molecular graph features and molecular fingerprint features, along with multi-modal encoding of DTIs. The method outperforms six advanced baseline models on the Biosnap, BindingDB, and Human benchmark datasets, with significant performance gains that hold across different experimental settings.
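The two drug-side ideas above, attention-pooling over a molecule's substructure embeddings and concatenating fingerprint features, can be illustrated with a small numpy sketch; the dot-product scoring rule, embedding sizes, and fingerprint lengths are assumptions for illustration only:

```python
import numpy as np

def substructure_attention(subs, query):
    """Attention-pool a variable number of substructure embeddings into one vector."""
    scores = subs @ query                 # one relevance score per substructure
    a = np.exp(scores - scores.max())
    a /= a.sum()                          # softmax attention weights
    return a @ subs                       # weighted sum over substructures

rng = np.random.default_rng(1)
subs = rng.normal(size=(7, 16))           # 7 substructure embeddings for one molecule
query = rng.normal(size=16)               # learned query vector (random here)
mol_vec = substructure_attention(subs, query)

# Three concatenated binary fingerprints stand in for the three fingerprint types.
fingerprints = rng.integers(0, 2, size=3 * 64)
drug_repr = np.concatenate([mol_vec, fingerprints])
print(drug_repr.shape)                    # (208,)
```

The attention weights let each molecule emphasize its pharmacologically important fragments, while the fingerprints preserve global descriptor information that a graph view alone may miss.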
Sensing Compound Substructures Combined with Molecular Fingerprinting to Predict Drug-Target Interactions
Wanhua Huang, Xuecong Tian, Ying Su, Sizhe Zhang, Chen Chen, Cheng Chen
DOI: 10.1007/s12539-025-00698-3
Interdisciplinary Sciences: Computational Life Sciences, pages 357-371, Pub Date: 2026-03-01
Pub Date : 2026-03-01 Epub Date: 2025-06-05 DOI: 10.1007/s12539-025-00721-7
Lei Shi, Ranran Gui, Li Wang, Peng Li, Qunfeng Niu
A Multi-Task Deep Learning Approach for Simultaneous Sleep Staging and Apnea Detection for Elderly People
Interdisciplinary Sciences: Computational Life Sciences, pages 341-356