Graves' disease (GD) is a common autoimmune disorder. However, the circulating proteins that causally drive its pathogenesis remain largely unconfirmed by genetic evidence. Identifying such proteins is critical for developing novel therapeutics. Methods: We performed a two-sample Mendelian randomization (MR) analysis using genetic instruments for 4907 plasma proteins and summary statistics from a GD genome-wide association study (GWAS) of European ancestry to identify causal proteins. We subsequently conducted protein-protein interaction (PPI) network analysis and used AlphaFold3 to model the structural impact of a key variant, rs41271951, in the top candidate protein, cathepsin S (CTSS). Results: MR analysis identified 23 plasma proteins with putative causal effects on GD risk. Among these, CD5L showed the strongest evidence for colocalization (posterior [Formula: see text]), suggesting a shared causal variant. Network analysis revealed that these proteins converge on a novel complement-ECM-coagulation axis in GD pathogenesis. CTSS emerged as a central hub in this network. AlphaFold3 modeling suggested that the CTSS variant rs41271951 (p.Val7Ala), located within the signal peptide, induces subtle structural perturbations. The primary and most plausible consequence is a reduction in CTSS secretion and circulating levels, as supported by the pQTL data. Conclusion: This multi-omics analysis proposes a novel complement-ECM-coagulation axis in GD. By structurally and functionally linking reduced CTSS abundance and secretion to genetic variation, we identify CTSS as a potential candidate for therapeutic repurposing in GD.
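The abstract does not state which MR estimator was used. As a minimal sketch, assuming a single-instrument Wald ratio (a common choice when a protein has one cis-pQTL instrument), the causal estimate is the variant-outcome effect divided by the variant-exposure effect. The function name and input numbers below are hypothetical and are not taken from the study.

```python
import math


def wald_ratio(beta_exposure, se_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate of the exposure's effect on the outcome.

    beta_exposure / se_exposure: variant-protein (pQTL) association.
    beta_outcome / se_outcome: variant-disease (GWAS) association.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # Delta-method variance, assuming the two association estimates are
    # uncorrelated (reasonable for non-overlapping two-sample data).
    variance = (se_outcome ** 2) / beta_exposure ** 2 + (
        beta_outcome ** 2 * se_exposure ** 2
    ) / beta_exposure ** 4
    se = math.sqrt(variance)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return estimate, se, z, p_value


# Hypothetical summary statistics for one variant.
estimate, se, z, p_value = wald_ratio(0.5, 0.05, 0.1, 0.02)
print(f"log-odds per unit protein: {estimate:.3f} (SE {se:.3f}), p = {p_value:.2e}")
```

With several independent instruments per protein, the per-variant ratios would typically be combined, for example by inverse-variance weighting; that step is omitted here.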
{"title":"Mendelian randomization and AlphaFold3 analysis suggest putative causal plasma proteins in graves' disease.","authors":"Bin Deng, Zhanlin Liao, Liangzhi Huang, Ting Chen, Qiao Chen, Shuai Zhong, Zugui Huang","doi":"10.1142/S0219720025500234","DOIUrl":"https://doi.org/10.1142/S0219720025500234","url":null,"abstract":"<p><p>Graves' disease (GD) is a common autoimmune disorder. However, the circulating proteins that causally drive its pathogenesis remain largely unconfirmed by genetic evidence. Identifying such proteins is critical for developing novel therapeutics. <b>Methods:</b> We performed a two-sample Mendelian randomization (MR) analysis using genetic instruments for 4907 plasma proteins and summary statistics from a GD genome-wide association study (GWAS) of European ancestry to identify causal proteins. We subsequently conducted protein-protein interaction (PPI) network analysis and used AlphaFold3 to model the structural impact of a key variant, rs41271951, in the top candidate protein, cathepsin S (CTSS). <b>Results:</b> MR analysis identified 23 plasma proteins with putative causal effects on GD risk. Among these, CD5L showed the strongest evidence for colocalization (posterior [Formula: see text]), suggesting a shared causal variant. Network analysis revealed that these proteins converge on a novel complement-ECM-coagulation axis in GD pathogenesis. CTSS emerged as a central hub in this network. AlphaFold3 modeling suggested that the CTSS variant rs41271951 (p.Val7Ala), located within the signal peptide, induces subtle structural perturbations. The primary and most plausible consequence is a reduction in CTSS secretion and circulating levels, as supported by the pQTL data. <b>Conclusion:</b> This multi-omics analysis proposes a novel complement-ECM-coagulation axis in GD. 
By structurally and functionally linking reduced CTSS abundance and secretion to genetic variation, we identify CTSS as a potential candidate for therapeutic repurposing in GD.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"23 6","pages":"2550023"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-01 DOI: 10.1142/S0219720025500210
Samar Monem, Ashraf Darwish, Aboul Ella Hassanien, Heba M Afify
Multi-drug therapy has become more common in recent years, especially among older people with multiple illnesses. However, patients are put at risk when unanticipated drug-drug interactions (DDIs) result in adverse reactions or serious toxicity. Predicting possible DDIs with a computational model improves the drug design process and reduces unexpected drug interactions and research expenses. In this paper, the proposed model is built as a complete graph convolutional network (GCN) on publicly available DDI data from DrugBank, covering 65 classes for DDI prediction. The data comprise 37,264 samples, each a pair of drugs described by three optimal features: chemical, target, and enzyme. This multi-class model consists of three phases: drug preprocessing, three GCN layers, and a fully connected network. The findings confirmed the performance of the proposed model, which achieved an accuracy of 95.12%, the best result compared with previous works on the same data. Although the data were imbalanced, this paper's primary contribution was to improve both the computational time and the classification evaluation metrics relative to other state-of-the-art models. Explainable artificial intelligence (XAI), in the form of SHapley Additive exPlanations (SHAP), is applied to the proposed model to avoid misclassification and produce easily comprehensible results. The proposed model will help to explore possible drug hazards and support intelligent pharmaceutical management.
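The abstract names three GCN layers but gives no layer equations. As a rough sketch, assuming the widely used propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), a single layer can be written in plain Python as below. The tiny graph, features, and weights are hypothetical, and the authors' architecture may differ.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step with self-loops and symmetric normalisation."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation: entry (i, j) is divided by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    aggregated = matmul(norm, feats)      # neighbourhood averaging
    projected = matmul(aggregated, weight)  # learned linear map
    return [[max(0.0, v) for v in row] for row in projected]  # ReLU


# Hypothetical two-node graph with one edge and identity weights.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(adj, feats, weight)
print(hidden)
```

Stacking three such layers and feeding the result to a fully connected classifier would mirror the phases listed in the abstract, at least in outline.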
{"title":"Multi-Classification of Drug-Drug interaction based on a complete graph convolutional neural network and explainable artificial intelligence.","authors":"Samar Monem, Ashraf Darwish, Aboul Ella Hassanien, Heba M Afify","doi":"10.1142/S0219720025500210","DOIUrl":"https://doi.org/10.1142/S0219720025500210","url":null,"abstract":"<p><p>Multi-drug therapy has become more common in recent years, especially among older people who have many illnesses. However, patients are put at risk when unanticipated drug-drug interactions (DDIs) result in negative reactions or serious toxicity. Predicting possible DDI by computational model improves the drug design process and minimizes unexpected drug interactions and research expenses. In this paper, the proposed model is constructed by a complete graph convolutional neural (GCN) network on publicly DDI data from the DrugBank, including 65 classes for DDI prediction. In this data, the number of samples is 37,264 for two drugs with three optimal features, including the chemical, target, and enzyme. This multi-classification model consists of three phases, including drug preprocessing, three layers of GCN, and a fully connected network. The findings confirmed the performance of this proposed model, achieving an accuracy of 95.12%, which is the best result compared with previous works on the same data. Although the data was imbalanced, this paper primary contribution was to enhance both the computational time and the classification evaluation metrics rather than other state-of-the-art models. Explainable artificial intelligence (XAI) is applied SHapley Additive exPlainations (SHAP) to the proposed model to avoid misclassification and produce easily comprehensible results. 
This proposed model will help to explore the possible drug hazards and support intelligent pharmaceutical management.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"23 6","pages":"2550021"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The emergence of SARS-CoV-2 has highlighted the need for computational methods to identify neutralizing antibodies. Existing sequence-based tools for predicting antigen-antibody interactions struggle to effectively identify antibodies capable of neutralizing different variants due to high sequence similarity among SARS-CoV-2 strains and the similarity in the framework regions (FWRs) of antibodies. To address this challenge, particularly the issue of high sequence similarity among homologous antigens that impedes accurate prediction of antigen-antibody interactions, we developed a deep learning framework named PLMABFW. It differentiates homologous antigens using encoding techniques and network architecture design. It employs pre-trained protein language models ESM-2 for antigens and AntiBERTy for antibodies to encode sequences and capture additional features. The framework also incorporates both antigen features and their transposed versions to enhance antigen information capture. To validate the performance of PLMABFW, we collected a SARS-CoV-2 neutralization dataset. PLMABFW outperformed existing neutralizing antibody prediction tools (AbAgIntPre, DeepAAI) and docking tools (HDOCK, LSTM-PHV) in predicting neutralizing antibodies for homologous antigens. Furthermore, it effectively learned the interactions between the antibody's CDR-H3 region and antigens via a partial masking strategy. The model code is available on GitHub for customization and adaptation to diverse research needs.
{"title":"PLMABFW: A deep learning framework for predicting Antibody-Antigen interactions using protein language model.","authors":"Yongbing Chen, Qianyi Jia, Xinyue Jia, Zhiguo Fu, Pingping Sun, Bo Li, Zilin Ren","doi":"10.1142/S0219720025500209","DOIUrl":"https://doi.org/10.1142/S0219720025500209","url":null,"abstract":"<p><p>The emergence of SARS-CoV-2 has highlighted the need for computational methods to identify neutralizing antibodies. Existing sequence-based tools for predicting antigen-antibody interactions struggle to effectively identify antibodies capable of neutralizing different variants due to high sequence similarity among SARS-CoV-2 strains and the similarity in the framework regions (FWRs) of antibodies. To address this challenge, particularly the issue of high sequence similarity among homologous antigens that impedes accurate prediction of antigen-antibody interactions, we developed a deep learning framework named PLMABFW. It differentiates homologous antigens using encoding techniques and network architecture design. It employs pre-trained protein language models ESM-2 for antigens and AntiBERTy for antibodies to encode sequences and capture additional features. The framework also incorporates both antigen features and their transposed versions to enhance antigen information capture. To validate the performance of PLMABFW, we collected a SARS-CoV-2 neutralization dataset. PLMABFW outperformed existing neutralizing antibody prediction tools (AbAgIntPre, DeepAAI) and docking tools (HDOCK, LSTM-PHV) in predicting neutralizing antibodies for homologous antigens. Furthermore, it effectively learned the interactions between the antibody's CDR-H3 region and antigens via a partial masking strategy. 
The model code is available on GitHub for customization and adaptation to diverse research needs.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"23 6","pages":"2550020"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-01 Epub Date: 2025-12-06 DOI: 10.1142/S0219720025500167
Miaomiao Jin, Weiyang Chen, Yi Pan
Early lifespan prediction in Caenorhabditis elegans faces the challenges of indistinct discriminative signals, subtle and localized key features, difficulty in data annotation, and poor generalization. We propose Contrastive Learning-guided Channel Attention Modulation (CLCAM), in which supervised contrastive learning clusters individuals with the same lifespan and separates different classes. The resulting embedding drives channel-wise gains that are additively coupled to the backbone, thereby amplifying subtle morphological cues. At inference, the contrastive branch is removed, keeping FLOPs essentially unchanged with a modest runtime cost on our hardware. On a public dataset, CLCAM achieves an AUC-ROC of 0.84, showing a consistent improvement over the EfficientNet-B3 baseline (0.82) and a substantial gain over the prior WormNet model (0.61). Grad-CAM indicates attention focused on the pharynx and body-wall musculature, supporting the biological plausibility of the model's decisions. CLCAM offers a clear, low-overhead paradigm for early lifespan phenotyping. CLCAM code is available at https://github.com/JMM502/CLCAM/tree/master/clcam.
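To illustrate what "supervised contrastive learning clusters individuals with the same lifespan" means in practice, the sketch below implements a generic supervised contrastive loss over L2-normalised embeddings. It assumes the standard formulation in which each anchor is pulled toward same-label samples and pushed from all others. The temperature value and toy embeddings are hypothetical, and this is not the code from the CLCAM repository.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive."""
    # L2-normalise so dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without same-class partners contributes nothing
        sims = {
            a: sum(x * y for x, y in zip(normed[i], normed[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Hypothetical 2-D embeddings forming two tight clusters.
points = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
print(supcon_loss(points, [0, 0, 1, 1]))  # labels agree with the clusters
print(supcon_loss(points, [0, 1, 0, 1]))  # labels cut across the clusters
```

The loss is lower when labels agree with the geometric clusters, which is the pressure that drives same-lifespan individuals together during training.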
{"title":"Early lifespan prediction in <i>Caenorhabditis elegans</i> via contrastive learning and channel attention.","authors":"Miaomiao Jin, Weiyang Chen, Yi Pan","doi":"10.1142/S0219720025500167","DOIUrl":"10.1142/S0219720025500167","url":null,"abstract":"<p><p>Early lifespan prediction in <i>Caenorhabditis</i> <i>elegans</i> faces the challenges of indistinct discriminative signals, subtle and localized key features, difficulty in data annotation, and poor generalization. We propose Contrastive Learning-guided Channel Attention Modulation (CLCAM), in which supervised contrastive learning clusters individuals with the same lifespan and separates different classes. The resulting embedding drives channel-wise gains that are additively coupled to the backbone, thereby amplifying subtle morphological cues. At inference, the contrastive branch is removed, keeping FLOPs essentially unchanged with a modest runtime cost on our hardware. On a public dataset, CLCAM achieves an AUC-ROC of 0.84, showing a consistent improvement over the EfficientNet-B3 baseline (0.82) and a substantial gain over the prior WormNet model (0.61). Grad-CAM indicates attention focused on the pharynx and body-wall musculature, supporting the biological plausibility of the model's decisions. CLCAM offers a clear, low-overhead paradigm for early lifespan phenotyping. 
CLCAM code is available at https://github.com/JMM502/CLCAM/tree/master/clcam.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2550016"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145688439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-01 Epub Date: 2025-11-26 DOI: 10.1142/S0219720025500180
Tatiana V Koshlan, Kirill G Kulikov
This study investigates the thermodynamic behavior of molecular self-assembly along biochemical pathways leading to the formation of higher-order complexes. We specifically examine how thermodynamic parameters, such as the dissociation constant [Formula: see text], the entropic contribution [Formula: see text], and the stability parameter of the interaction matrix [Formula: see text], evolve as molecular complexity increases from monomers to dimers, trimers, and tetramers. A central hypothesis is that stepwise thermodynamic modeling allows prediction of assembly pathways, identification of dead-end points, and detection of co-directional changes in thermodynamic variables during complex formation, which reflect a preference for the main direction of biocomplex formation. We also introduce a practical rule to classify dead-end intermediates: a pathway step is considered a dead end if the minimum [Formula: see text] occurs at a non-final intermediate or if [Formula: see text] falls below zero, indicating an entropic barrier. This criterion provides a reproducible way to flag non-viable assembly routes. We apply this analysis to several biologically relevant molecular systems, including the complex of LGP2 bound to an 8-base-pair double-stranded RNA molecule, the dimer of VP35 protein interacting with double-stranded RNA, and hexamer formations.
{"title":"Study of the mechanism of step-by-step interaction of viral proteins during replication and transcription.","authors":"Tatiana V Koshlan, Kirill G Kulikov","doi":"10.1142/S0219720025500180","DOIUrl":"https://doi.org/10.1142/S0219720025500180","url":null,"abstract":"<p><p>This study investigates the thermodynamic behavior of molecular self-assembly along biochemical pathways leading to the formation of higher-order complexes. We specifically examine how thermodynamic parameters evolve - such as the dissociation constant [Formula: see text], the entropic contribution [Formula: see text], and the stability parameter of the interaction matrix [Formula: see text] - as molecular complexity increases from monomers to dimers, trimers, and tetramers. A central hypothesis is that stepwise thermodynamic modeling allows prediction of assembly pathways, identification of dead ends points, and the co-directional changes of thermodynamic variables during complex formation which reflect a preference for main biocomplex formation direction. We also introduce a practical rule to classify dead-end intermediates: a pathway step is considered a dead-end if the minimum [Formula: see text] occurs at a non-final intermediate or if [Formula: see text] falls below zero, indicating an entropic barrier. This criterion provides a reproducible way to flag non-viable assembly routes. 
We apply this analysis to several biologically relevant molecular systems, including the complex of LGP2 bound to an 8-base pair double-stranded RNA molecule, the dimer of VP35 protein interacting with double-stranded RNA and hexamer formations.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"23 6","pages":"2550018"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-01 Epub Date: 2025-12-06 DOI: 10.1142/S0219720025500192
Fatemeh Khoushehgir, Zahra Noshad
Predicting interactions between ncRNAs and proteins is crucial for advancing our understanding of gene regulation, disease mechanisms, targeted drug design, and biomarker discovery, thereby driving innovation in research and therapeutic development. Numerous computational methods, particularly those employing machine learning and deep learning, have been proposed to address this challenge. Recent studies show that graph neural networks (GNNs) enhance ncRNA-protein interaction prediction accuracy by capturing intricate relationships and structural details in molecular data. However, current GNN approaches frequently rely on fixed-hop subgraphs for structural analysis, limiting their capacity to capture diverse interaction patterns fully. This fixed-hop approach may omit crucial nodes and edges outside the predefined neighborhood, potentially reducing prediction accuracy. To overcome this constraint, we introduce a novel method for ncRNA-protein interaction prediction by extracting the most informative subgraphs around each interaction using the personalized subgraph selection framework. These subgraphs are then utilized in a graph attention network (GAT) to learn node representations. K-mer frequencies are used to capture sequence-level features, while node2vec embeddings capture structural information, providing the GNN with a robust set of features. Experimental results on relevant datasets indicate a significant improvement in predicting ncRNA-protein interactions, with the algorithm maintaining an acceptable level of computational complexity even on large datasets. By integrating both sequence and structural insights through personalized subgraphs, this approach delivers a more accurate and scalable solution for predicting ncRNA-protein interactions.
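The k-mer frequency features mentioned above can be sketched as follows. This is a minimal illustration, assuming an RNA alphabet of A, C, G, U and a fixed lexicographic ordering of k-mers; the value of k, the function name, and the handling of ambiguous bases are assumptions, not details from the paper.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return normalised k-mer frequencies in lexicographic k-mer order."""
    sequence = sequence.upper()
    kmers = ["".join(chars) for chars in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if window in counts:  # skip windows containing non-alphabet symbols such as N
            counts[window] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


# Hypothetical short ncRNA fragment; the vector has 4**k entries.
features = kmer_frequencies("ACGUACGU", k=2)
print(len(features), round(sum(features), 6))
```

For protein sequences the same function would be called with a 20-letter amino-acid alphabet, at the cost of a much longer feature vector (20**k entries).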
{"title":"Predicting ncRNA-Protein interactions with a graph attention model exploiting personalized subgraphs.","authors":"Fatemeh Khoushehgir, Zahra Noshad","doi":"10.1142/S0219720025500192","DOIUrl":"10.1142/S0219720025500192","url":null,"abstract":"<p><p>Predicting interactions between ncRNAs and proteins is crucial for advancing our understanding of gene regulation, disease mechanisms, targeted drug design, and biomarker discovery, thereby driving innovation in research and therapeutic development. Numerous computational methods, particularly those employing machine learning and deep learning, have been proposed to address this challenge. Recent studies show that graph neural networks (GNNs) enhance ncRNA-protein interaction prediction accuracy by capturing intricate relationships and structural details in molecular data. However, current GNN approaches frequently rely on fixed-hop subgraphs for structural analysis, limiting their capacity to capture diverse interaction patterns fully. This fixed-hop approach may omit crucial nodes and edges outside the predefined neighborhood, potentially reducing prediction accuracy. To overcome this constraint, we introduce a novel method for ncRNA-protein interaction prediction by extracting the most informative subgraphs around each interaction using the personalized subgraph selection framework. These subgraphs are then utilized in a graph attention network (GAT) to learn node representations. <i>K</i>-mer frequencies are used to capture sequence-level features, while node2vec embeddings capture structural information, providing the GNN with a robust set of features. Experimental results on relevant datasets indicate a significant improvement in predicting ncRNA-protein interactions, with the algorithm maintaining an acceptable level of computational complexity even on large datasets. 
By integrating both sequence and structural insights through personalized subgraphs, this approach delivers a more accurate and scalable solution for predicting ncRNA-protein interactions.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2550019"},"PeriodicalIF":0.7,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145688423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-10-01 Epub Date: 2025-10-07 DOI: 10.1142/S0219720025500143
Mohsen Karami Fath
Background: Dengue virus (DENV) remains a major public health challenge with limited vaccine options, and current licensed vaccines exhibit restricted efficacy and safety concerns in certain populations. Advanced immunoinformatics approaches offer opportunities for designing multi-epitope vaccines targeting conserved and immunogenic regions of viral proteins. Objective: To design and computationally evaluate a novel multi-epitope vaccine targeting the Envelope (E) and Non-Structural protein 1 (NSP1) of DENV-1 and DENV-2 using integrated immunoinformatics and structural bioinformatics. Methods: CTL, HTL, and B-cell epitopes were predicted from the E and NSP1 proteins and screened for antigenicity, non-allergenicity, and non-toxicity. High-affinity epitopes were linked with appropriate spacers and adjuvants (human [Formula: see text]-defensin-3 or 50S ribosomal protein L7/L12) to construct two vaccine candidates. Molecular docking with TLR2/TLR4, molecular dynamics (MD) simulations, MM/GBSA binding free energy analysis, population coverage assessment, codon optimization, and immune simulations were conducted. Control docking using scrambled peptides was included to evaluate binding specificity. Results: Both vaccine constructs were predicted to be stable, soluble, non-allergenic, and non-toxic. Vaccine 2 showed higher antigenicity (VaxiJen: 0.6127) and stronger TLR2 binding ([Formula: see text]: -110.37[Formula: see text]kcal/mol), whereas vaccine 1 demonstrated better solubility and TLR4 interaction stability. Control docking with scrambled peptides produced less favorable binding energies, supporting specificity. MD simulations confirmed structural stability, and immune simulations predicted robust humoral and cellular responses with high IFN-[Formula: see text] production. Population coverage exceeded 98% in most regions. Conclusion: The designed multi-epitope vaccines demonstrate promising immunogenic potential in silico. 
Experimental validation is required to confirm safety, efficacy, and protective capability against multiple DENV serotypes.
{"title":"The novel design of a multi-epitope vaccine candidate against the dengue virus using advanced immunoinformatics and structural analysis.","authors":"Mohsen Karami Fath","doi":"10.1142/S0219720025500143","DOIUrl":"https://doi.org/10.1142/S0219720025500143","url":null,"abstract":"<p><p><b>Background:</b> Dengue virus (DENV) remains a major public health challenge with limited vaccine options, and current licensed vaccines exhibit restricted efficacy and safety concerns in certain populations. Advanced immunoinformatics approaches offer opportunities for designing multi-epitope vaccines targeting conserved and immunogenic regions of viral proteins. <b>Objective:</b> To design and computationally evaluate a novel multi-epitope vaccine targeting the Envelope (E) and Non-Structural protein 1 (NSP1) of DENV-1 and DENV-2 using integrated immunoinformatics and structural bioinformatics. <b>Methods:</b> CTL, HTL, and B-cell epitopes were predicted from the E and NSP1 proteins and screened for antigenicity, non-allergenicity, and non-toxicity. High-affinity epitopes were linked with appropriate spacers and adjuvants (human [Formula: see text]-defensin-3 or 50S ribosomal protein L7/L12) to construct two vaccine candidates. Molecular docking with TLR2/TLR4, molecular dynamics (MD) simulations, MM/GBSA binding free energy analysis, population coverage assessment, codon optimization, and immune simulations were conducted. Control docking using scrambled peptides was included to evaluate binding specificity. <b>Results:</b> Both vaccine constructs were predicted to be stable, soluble, non-allergenic, and non-toxic. Vaccine 2 showed higher antigenicity (VaxiJen: 0.6127) and stronger TLR2 binding ([Formula: see text]: -110.37[Formula: see text]kcal/mol), whereas vaccine 1 demonstrated better solubility and TLR4 interaction stability. Control docking with scrambled peptides produced less favorable binding energies, supporting specificity. 
MD simulations confirmed structural stability, and immune simulations predicted robust humoral and cellular responses with high IFN-[Formula: see text] production. Population coverage exceeded 98% in most regions. <b>Conclusion:</b> The designed multi-epitope vaccines demonstrate promising immunogenic potential in silico. Experimental validation is required to confirm safety, efficacy, and protective capability against multiple DENV serotypes.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"23 5","pages":"2550014"},"PeriodicalIF":0.7,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145287359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-10-01 Epub Date: 2025-10-07 DOI: 10.1142/S0219720025500155
Jie-Huei Wang, Tzung-Ying Guo, Yen-Yi Pai, Po-Lin Hou, Himani Kumari, Michael W Y Chan
Leveraging high-dimensional transcriptomic data from The Cancer Genome Atlas (TCGA) for cancer classification holds critical significance for advancing precision oncology. Matched Case-control Design (MCCD), by pairing similar cases with controls, can enhance statistical power and reduce confounding bias. However, high-dimensional data present challenges such as overfitting, instability, and difficulty in interpretation, collectively referred to as the "curse of dimensionality." Feature selection can help mitigate these problems by identifying representative variables and reducing redundancy. This study's innovation lies in integrating a set of existing techniques into a unified analytical workflow tailored specifically for MCCD, validated through both simulated and real TCGA datasets. We compared the performance of paired versus unpaired feature selection approaches under simulated 1:1 MCCD scenarios, and developed a modular, pluggable pipeline. This includes mean-centering, gene filtering, and a Corrected Feature Matrix (CFM) transformation step that explicitly preserves the matched structure. This transformation is then combined with machine learning classifiers to predict cancer status. We also incorporated Incremental Feature Selection (IFS) to refine gene subsets and employed gene set enrichment analysis to enhance biological interpretability. While the individual components we used, such as paired testing, CFM, IFS, and model-based gene set analysis, are not novel in themselves, we demonstrate an integrated workflow optimized for MCCD tasks. This workflow outperforms uncorrected approaches in terms of classification accuracy, feature stability, and interpretability. Our results indicate that this method can enhance cancer classification accuracy, facilitate biomarker discovery, and aid in building interpretable diagnostic models, providing a practical and scalable tool for precision medicine.
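The difference between paired and unpaired feature selection in a 1:1 matched design can be made concrete with a paired t-statistic, which tests the within-pair differences rather than the two group means. The sketch below ranks genes by the absolute paired statistic. The gene names and expression values are hypothetical, and the paper's pipeline (mean-centering, CFM transformation, IFS) is not reproduced here.

```python
import math
from statistics import mean, stdev


def paired_t(case_values, control_values):
    """Paired t-statistic; position i in both lists is one matched pair."""
    diffs = [case - control for case, control in zip(case_values, control_values)]
    avg = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        return 0.0 if avg == 0 else math.copysign(math.inf, avg)
    return avg / (spread / math.sqrt(len(diffs)))


def rank_genes(case_expr, control_expr):
    """Sort gene names by decreasing absolute paired t-statistic."""
    scores = {gene: paired_t(case_expr[gene], control_expr[gene]) for gene in case_expr}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True), scores


# Hypothetical expression values for four matched case-control pairs.
case_expr = {"GENE_A": [5.0, 6.0, 7.0, 8.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
control_expr = {"GENE_A": [4.0, 5.0, 6.0, 6.0], "GENE_B": [1.0, 3.0, 2.0, 4.0]}
ranking, scores = rank_genes(case_expr, control_expr)
print(ranking, scores)
```

An unpaired test on GENE_A would be diluted by the large between-pair spread (values from 4 to 8), whereas the paired statistic only sees the consistent within-pair shift.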
Jie-Huei Wang, Tzung-Ying Guo, Yen-Yi Pai, Po-Lin Hou, Himani Kumari, Michael W Y Chan. Cancer classification and functional pathway discovery using TCGA transcriptomic profiles: A matched case-control framework. Journal of Bioinformatics and Computational Biology 23(5), 2550015. Pub Date: 2025-10-01. DOI: https://doi.org/10.1142/S0219720025500155
Non-small cell lung carcinoma (NSCLC) accounts for about 80% of lung cancers and is genetically heterogeneous. Driver mutations such as those in EGFR and KRAS have enabled targeted therapies with kinase inhibitors, whereas immune checkpoint inhibitors (ICIs) have revolutionized immunotherapy. However, frequent drug resistance and low response rates highlight the need for novel therapeutic strategies. Boolean network modeling is a powerful mathematical tool for simulating complex biological processes and optimizing potential treatment strategies. This study developed a Boolean network model for NSCLC patients with different mutational backgrounds and evaluated therapeutic effects by incorporating inhibitors of key mutant kinases and immunological interventions. Simulations in both the Boolean network model and a second, quantitative model consistently suggested that the optimal strategy for KRAS-mutant patients is a combination of a KRAS inhibitor and an ICI, which is in line with mouse model studies and the KRYSTAL-7 phase-2 clinical trial data. Further validation can reasonably be expected from the recently announced KRYSTAL-7 phase-3 clinical trial, which compares the combined therapy with pembrolizumab monotherapy. Our approach highlights the value of computational modeling for evaluating and refining therapeutic strategies in precision oncology.
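The mechanics of Boolean network simulation under different interventions can be sketched as follows. This is a toy example, not the published model: the four nodes (`KRAS`, `PROLIF`, `TCELL`, `TUMOR`), their update rules, and the function names are all assumptions invented for illustration, and the outcome reflects this toy wiring only, not evidence about any therapy. Each node is on or off, all nodes are updated synchronously, and the simulation runs until a state repeats, which identifies the attractor.

```python
def step(state, kras_inhibitor, ici):
    """One synchronous update of a hypothetical 4-node network."""
    return {
        # Assumed KRAS-mutant background: active unless inhibited.
        "KRAS": not kras_inhibitor,
        # Proliferation follows KRAS activity from the previous step.
        "PROLIF": state["KRAS"],
        # T cells are assumed suppressed unless a checkpoint inhibitor is given.
        "TCELL": ici,
        # Tumor persists if proliferating, or if present and not cleared.
        "TUMOR": state["PROLIF"] or (state["TUMOR"] and not state["TCELL"]),
    }


def run_to_attractor(state, kras_inhibitor, ici, max_steps=50):
    """Iterate until a state repeats; return the attractor as a list of states."""
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen[seen.index(state):]
        seen.append(state)
        state = step(state, kras_inhibitor, ici)
    raise RuntimeError("no attractor found within max_steps")


# Assumed starting point: untreated tumor with active KRAS signaling.
initial = {"KRAS": True, "PROLIF": True, "TCELL": False, "TUMOR": True}

outcomes = {}
for kras_inhibitor in (False, True):
    for ici in (False, True):
        attractor = run_to_attractor(dict(initial), kras_inhibitor, ici)
        outcomes[(kras_inhibitor, ici)] = attractor[-1]["TUMOR"]
        print(f"KRASi={kras_inhibitor!s:5} ICI={ici!s:5} "
              f"tumor persists: {attractor[-1]['TUMOR']}")
```

In a real study the rules would be derived from curated pathway knowledge and mutational background, and attractors would typically be explored from many initial states rather than one.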
Lanqi Wu, Ruocheng Yu, Minghui Yao, Md Matiur Rahaman, Zhaoyuan Fang. Modelling and optimizing combination therapeutic strategies for KRAS- and EGFR-mutant lung cancer. Journal of Bioinformatics and Computational Biology 23(5), 2550017. Pub Date: 2025-10-01. DOI: https://doi.org/10.1142/S0219720025500179
Pub Date: 2025-10-01 DOI: 10.1142/S0219720025010012
Limsoon Wong
Editorial: Guidelines for Credible Machine Learning in Computational Biology. Journal of Bioinformatics and Computational Biology 23(5), 2501001.