Pub Date: 2026-06-01; Epub Date: 2026-01-13; DOI: 10.1016/j.compbiolchem.2026.108893
Xinyi Wang, Minjie Zhou, Yunshan Su, Shunfang Wang
Multi-functional therapeutic peptides (MFTP) play a crucial role in drug development, exhibiting properties such as anti-cancer and anti-inflammatory effects. Artificial intelligence-based predictors have been developed to identify MFTP and achieve satisfactory performance. However, these predictors often rely heavily on latent sequence features, overlook physicochemical patterns, and struggle with dataset imbalance. In this study, we propose MFTP-MFML, a model that combines multiple interpretable features with multiple loss functions. First, embedding features with positional information are fed into a bi-directional long short-term memory (BiLSTM) network, which generates a latent representation while preserving the original sequence information. Second, physicochemical attributes are used to capture the amino acid composition and physicochemical properties associated with the different functions of therapeutic peptides, and the latent representation is concatenated with these attributes to enhance classification. Third, to address class imbalance and capture label correlations, an integration loss is employed that incorporates focal loss, binary cross-entropy loss, and Dice loss. Fourth, to broaden the diversity of therapeutic peptide functions, MFTP-Mixed-90, a benchmark dataset comprising 27 functions, is constructed. Finally, we evaluate the model against other methods on the PrMFTP and MFTP-Mixed-90 datasets. Experimental results demonstrate that MFTP-MFML outperforms existing methods by effectively utilizing integrated features and loss functions. Our code and the datasets are available at https://github.com/wongsing/MFTP-MFML.
Title: Integration of interpretable multi-features and multi-loss functions for multi-functional therapeutic peptide prediction via dataset construction. Journal: Computational Biology and Chemistry, volume 122, Article 108893. Publication type: Journal Article. Impact factor: 3.1.
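The integration loss named in the MFTP-MFML abstract (focal loss, binary cross-entropy loss, and Dice loss) can be illustrated with a minimal pure-Python sketch. The abstract does not say how the three terms are weighted or parameterised, so the equal weights, `gamma=2.0`, the `smooth` term, and all function names below are assumptions for illustration, not the authors' implementation.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean focal loss: cross-entropy scaled by (1 - p_t)^gamma,
    which down-weights labels that are already predicted well."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        p_t = p if t == 1 else 1.0 - p  # probability given to the true class
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2*overlap + smooth) / (sum_p + sum_t + smooth)."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def integration_loss(probs, targets, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms (weights are hypothetical)."""
    w_focal, w_bce, w_dice = weights
    return (w_focal * focal_loss(probs, targets)
            + w_bce * bce_loss(probs, targets)
            + w_dice * dice_loss(probs, targets))

# One peptide with four functional labels; predictions are sigmoid outputs.
targets = [1, 0, 0, 1]
good = [0.9, 0.1, 0.2, 0.8]
bad = [0.2, 0.7, 0.6, 0.3]
print(integration_loss(good, targets) < integration_loss(bad, targets))
```

The focal term emphasises hard, typically minority-class labels, while the Dice term scores the overlap between the predicted and true label sets as a whole, which is one plausible reading of how such a combination targets imbalance and label correlation.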
Pub Date: 2026-06-01; Epub Date: 2026-01-08; DOI: 10.1016/j.compbiolchem.2026.108891
Sonia Knawal, Ameer Mahmood Shaker, Mohammed Albahloul Rajab, Yatreb Omar Alkhbulli, Shifaa O. Alshammari, Emad Solouma, Mustafa Sabri Cheyad
Antimicrobial resistance is a serious health problem worldwide, and one cause of multidrug resistance is AmpC β-lactamase-producing Enterobacter cloacae. This work analyzed the antibacterial properties of the plant-derived alkaloid sanguinarine and used artificial intelligence (AI) to improve its derivatives in ways that may enhance antimicrobial activity. Molecular docking showed that AI-optimized Ligand 1 had the strongest binding affinity, −9.7 kcal/mol (AutoDock Vina) and −166.94 kcal/mol (HDOCK), compared with −9.2 kcal/mol for sanguinarine. Molecular dynamics and MMGBSA/MMPBSA analyses confirmed that the lead AI-optimized sanguinarine derivative binds stably with good energetics, suggesting that it may be a promising lead for AmpC β-lactamase inhibitors. Density functional theory (DFT) computations gave a HOMO-LUMO gap of 0.17089 eV for the lead AI-optimized compound, indicating moderate reactivity. Pharmacophore analysis identified key aromatic, hydrogen bond acceptor, and hydrophobic sites, and the AI-optimized derivative was found to be a better drug-like assembly than natural sanguinarine. The ADMET analysis showed potential lipophilicity, whole-GI absorption, and BBB permeability of the AI-optimized derivative, along with decreased toxicity in general, especially neurotoxicity. The results indicate that AI optimization can help improve the antimicrobial activity and safety profile of sanguinarine, making it a promising additional tool for combating antibiotic resistance. Further in vivo studies are required to validate these computational predictions and to confirm whether AI-optimized derivatives can be used in clinical practice.
Title: AI-optimized sanguinarine derivatives inhibiting sortase A for combating AmpC β-lactamase resistance in Enterobacter cloacae: An integrated computational approach. Journal: Computational Biology and Chemistry, volume 122, Article 108891. Publication type: Journal Article. Impact factor: 3.1.
Pub Date: 2026-06-01; Epub Date: 2026-01-08; DOI: 10.1016/j.compbiolchem.2026.108879
Xin Chen, Sheng Yi, Anwaier Yuemaierabola, Yuhan Liu, Liang He, Jing Ma, Wenjia Guo, Gang Sun
Determining molecular markers that mediate clinically aggressive phenotypes in prostate cancer is a significant challenge. While traditional linear models offer some interpretability, they often lack the precision needed for complex multi-omics data. Conversely, conventional deep learning methods provide robust predictions but typically remain opaque, hindering the identification of impactful molecular markers and biological mechanisms. To address this, we propose the Cross-omics Interpretable Neural Network (CINN), a biomimetic framework designed to predict prostate cancer states and identify key molecular markers by integrating diverse omics data.
CINN innovatively leverages prior biological knowledge from either pathway or protein–protein interaction (PPI) networks, combined with a novel trainable mask layer. This mask dynamically optimizes the strength of pre-defined biological connections, thereby enhancing both knowledge representation and model interpretability. The framework effectively integrates multi-omics data, including gene expression, somatic mutations, and copy number variations, to provide a holistic view of the disease.
Extensive experiments on a prostate cancer dataset demonstrate that CINN achieves substantial and statistically significant performance enhancements over a strong baseline (P-NET). Specifically, our best-performing variant, CINN-pw with a trainable mask, improved F1 scores by 13.1% to 0.843, Accuracy by 8.3% to 0.894, and AUC by 2.3% to 0.949. These gains were consistently statistically significant (p < 0.0001 for most key metrics), underscoring the robustness of our approach. Crucially, CINN’s inherent interpretability facilitated the identification of pivotal molecular candidates, including TBP and TAF2, which are implicated in prostate cancer progression. These findings are supported by existing literature and provide valuable insights into the underlying mechanisms of prostate cancer, offering potential avenues for targeted therapeutic interventions and precision medicine.
Title: Cross-omics interpretable neural network for discovery of molecular markers in prostate cancer. Journal: Computational Biology and Chemistry, volume 122, Article 108879. Publication type: Journal Article. Impact factor: 3.1.
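The trainable mask layer described in the CINN abstract can be pictured as a sparse linear layer in which a fixed binary prior (for example, gene-to-pathway membership) decides which connections exist and a learnable strength rescales each allowed connection. The abstract does not give the exact parameterisation, so the elementwise product below, the function name `masked_linear`, and the toy gene/pathway layout are assumptions for illustration only.

```python
def masked_linear(x, weights, prior_mask, mask_strength):
    """Forward pass of a knowledge-constrained layer (hypothetical form):

        y[j] = sum_i x[i] * weights[i][j] * prior_mask[i][j] * mask_strength[i][j]

    prior_mask is fixed (1 = connection exists in the pathway/PPI prior),
    while weights and mask_strength would both be updated during training.
    """
    n_in, n_out = len(weights), len(weights[0])
    out = [0.0] * n_out
    for i in range(n_in):
        for j in range(n_out):
            out[j] += x[i] * weights[i][j] * prior_mask[i][j] * mask_strength[i][j]
    return out

# Toy example: 3 genes feeding 2 pathways.
x = [1.0, 2.0, 3.0]                      # per-gene input signal
weights = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
prior_mask = [[1, 0], [1, 1], [0, 1]]    # gene 0 -> pathway 0 only, etc.
strength = [[1.0, 1.0], [1.0, 0.5], [1.0, 1.0]]

print(masked_linear(x, weights, prior_mask, strength))  # [3.0, 4.0]
```

Because the prior mask multiplies every term, a connection absent from the biological prior contributes nothing regardless of its learned strength, so the learned strengths can be read as importance scores over known biological links.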
Pub Date: 2026-06-01; Epub Date: 2026-01-09; DOI: 10.1016/j.compbiolchem.2026.108892
Zhichen Cai, Jia Xue, Yongyi Zhou, Haijie Chen, Jingjing Shi, Lisi Zou, Cuihua Chen, Xunhong Liu
Epimedii Folium (EF) is a widely used traditional Chinese drug that encompasses a variety of species. It has been reported that the active constituents in EF vary, leading to uneven quality of commercial medicinal materials. To investigate the specific differences, we developed a comprehensive evaluation method to assess the quality of EF from four species. First, UPLC-Triple TOF-MS/MS was used to generate characteristic fingerprints of Epimedium samples; second, an adenine-induced Kidney Yang deficiency model was established to evaluate the quality of the four Epimedium varieties through biochemical markers and morphology; third, multivariate statistical analyses, including gray correlation analysis (GCA) and bivariate correlation analysis (BCA), were combined; and finally, UPLC-QTRAP MS/MS was used to identify a potential biomarker. The results showed that 12 common peaks were identified in 40 batches derived from four Epimedium species. The severity of kidney and testicular lesions in the experimental groups of rats improved significantly compared with the model group. GCA and BCA indicated that three ingredients, icariin, quercitrin, and epimedin B, were potential biomarkers, which was confirmed using LC-MS. In addition, epimedin B and icariin were significantly higher in EBM than in the other three species, consistent with the pharmacological tests. The quality and efficacy of EF from different origins were stable, and all of them had protective effects against Kidney Yang deficiency in rats. In particular, all data suggested that EBM is of higher quality than the other three. Overall, our work offers fundamental data for the thorough assessment of EF from several species and a fresh viewpoint on its quality control.
Title: Quality evaluation of Epmedii Folium from different species based on spectrum-efficacy relationship. Journal: Computational Biology and Chemistry, volume 122, Article 108892. Publication type: Journal Article. Impact factor: 3.1.
Virtual screening has emerged as one of the most impactful in silico approaches for the identification of novel drug candidates, substantially reducing the cost and time associated with high-throughput screening (HTS). Ongoing efforts focus on exploring large-scale libraries of drug-like molecules to identify candidates with favourable pharmacological properties. In this study, we propose an applicability domain-based virtual screening strategy that extends beyond conventional approaches by prioritising compounds with ADMET profiles comparable to marketed drugs. To further enhance predictive performance, we developed a QSAR model on PI3K ligands using Light Gradient Boosting Machine (LGBM), which achieved an R2 value of 0.799, thereby providing an additional layer of validation for compound selection. The phosphoinositide 3-kinase (PI3K) pathway, a critical regulator of cell growth, survival, metabolism, and proliferation, is frequently dysregulated in multiple cancers and other diseases. Repurposing existing drugs that modulate PI3K activity offers the potential to accelerate therapeutic development while mitigating the challenges of de novo drug discovery.
To demonstrate the utility of our approach, we screened two compound libraries from Enamine, a hit-like locator library (>400,000 molecules) and a kinase-focused library (>64,000 molecules), against the PI3K-α isoform. In addition, a set of 1367 FDA-approved drugs was screened to identify potential candidates for repurposing. From these extensive datasets, three small molecules from the Enamine libraries were identified with favourable drug-like properties and synthetic accessibility compared with existing PI3K-α inhibitors. Furthermore, one FDA-approved drug demonstrated potential PI3K-α inhibitory activity. Pharmacophore mapping provided additional validation of their drug-likeness. Importantly, wet-lab evaluation of the FDA-approved drug confirmed its inhibitory activity, thereby supporting the computational predictions.
Overall, our integrated in silico and experimental framework highlights promising PI3K-α inhibitors, underscoring the potential of applicability domain–based virtual screening and QSAR modelling for both drug discovery and repurposing.
Title: From virtual screening to bench: A dual-validation framework for drug repurposing against PI3K. Authors: Kavita Tewani, Zunnun Narmawala, Deepshikha Rathore, Heena Dave. Pub Date: 2026-06-01; DOI: 10.1016/j.compbiolchem.2026.108934. Journal: Computational Biology and Chemistry, volume 122, Article 108934. Publication type: Journal Article. Impact factor: 3.1.
Drug repurposing is a promising approach to drug discovery that has the potential to improve patient outcomes and address unmet medical needs. This study attempted to repurpose existing sulfonamide drugs, which are effective in treating bacterial infections, as novel anticancer drugs. DrugBank was searched for sulfonamides, and 25 drugs with functional groups such as SH, OSO, CS, and -S- were chosen for the study. Drug properties such as dipole moment, volume, polarizability, highest occupied molecular orbital (HOMO), lowest unoccupied molecular orbital (LUMO), and electrostatic potential map were analysed through a quantum mechanical approach with different functionals (M062X, M06HF, and B3LYP) and basis sets (6-31+G*, LANL2DZ). The electrostatic potential map was analysed to determine the magnitude, size, and distribution of the electron cloud surrounding the sulfur atoms. Analysis of NBO (Natural Bond Orbital) and NCI (Non-Covalent Interaction) plots confirmed the presence of intramolecular hydrogen bonding in the sulfonamide drugs. Furthermore, the frontier molecular orbitals (HOMO and LUMO) and the band gap were examined for all drugs to identify the best electron acceptors and donors. Docking analysis was performed to build a lock-and-key model of the 25 sulfonamide drugs with the most promising cancer-targeted protein (1ZZ1), a histone deacetylase (HDAC). The best drug orientation (optimal position) was discussed and compared with the control ligand SHH on the basis of binding affinity and root mean square deviation (RMSD). The binding affinity of the control ligand SHH is −8.1 kcal/mol for the 2nd pose, which matches exactly with the SHH ligand in 1ZZ1. The drugs Tolazamide, Fezolinetant, Ensulizole, Taurolidine, Acetohexamide, Isoxicam, Sulfamethizole, Sulfamethoxazole, Sulfapyridine, Sulfaphenazole, and Dodecyl sulphate were observed to exhibit high molecular volume, polarizability, and dipole moment and significant HOMO and LUMO values, and they are recommended for further quantum mechanical calculations. The findings of this study will be essential for evaluating the properties of sulfonamide drugs from DrugBank using a variety of analyses in order to repurpose them as novel anticancer drugs. Quantum mechanical calculations will be performed on the optimal docking poses in future work. Keywords: Sulfonamide drugs, Docking, Histone deacetylases, Lipinski’s rule, Binding affinity
Title: Repurposing sulfonamide drugs as anticancer ligands and understanding its properties through density functional theory. Authors: Palanisamy Deepa, Balasubramanian Sundarakannan, Duraisamy Thirumeignanam. Pub Date: 2026-06-01; DOI: 10.1016/j.compbiolchem.2026.108933. Journal: Computational Biology and Chemistry, volume 122, Article 108933. Publication type: Journal Article. Impact factor: 3.1.
Urinary proteins are promising non-invasive biomarkers, but their low abundance and wide dynamic range make detection challenging. This study presents UriPred, a computational tool that integrates machine learning (ML), BLAST, and Motif-EmeRging and Classes-Identification (MERCI) to predict urinary proteins and facilitate the identification of liver cancer (LC) biomarkers. A dataset of 10588 urinary and non-urinary proteins was curated, from which two feature types were generated: 10074 compositional and 20 evolutionary features. Seven feature selection methods were applied to compositional features, and 11 ML algorithms were trained on different feature sets. Evolutionary features achieved the highest training performance (AUC 0.79, accuracy 71.99 %), whereas amino acid composition (AAC) with 20 features achieved identical validation AUC (0.74) and comparable accuracy while being computationally less expensive and consistently selected. The ML-AAC model was therefore chosen as the final model. This optimal model was integrated with BLAST and MERCI to create UriPred, which reduced false positives from 34.59 % (ML) to 3.12 % (hybrid) on the validation dataset and from 5.8 % (ML) to zero (hybrid) on an external dataset. Using UriPred, 53 LC differentially expressed protein-coding genes were predicted as urinary proteins. Protein-protein interaction analysis, AUROC evaluation (AUC > 0.80), survival analysis, and cross-verification of urine detectability with the Human Protein Atlas and Human Urine PeptideAtlas databases identified five proteins (KIF23, COL15A1, CTHRC1, MMP9, and SPP1) as potential LC biomarkers. UriPred efficiently predicts urinary proteins using AAC features and enables biomarker discovery for LC. The tool is publicly available at https://github.com/Dahrii-Paul/UriPred.
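The 20-feature amino acid composition (AAC) descriptor mentioned above can be sketched as follows. This is a minimal illustration assuming the usual definition of AAC (the fraction of each of the 20 standard residues in a sequence); the function name `amino_acid_composition` and the example peptide are hypothetical and are not taken from the UriPred code base.

```python
from collections import Counter

# The 20 standard amino acids, one feature per residue type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard symbols (e.g. X, B, Z) are ignored; this is an assumption
    of the sketch, not a documented UriPred behaviour.
    """
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical 10-residue peptide used only for illustration.
features = amino_acid_composition("MKTAYIAKQR")
```

The resulting 20-element vector is what a classifier would consume; its small, fixed size is consistent with the abstract's point that AAC is computationally cheaper than the 10074 compositional features.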
{"title":"UriPred: Machine learning prediction of urinary proteins and identification of biomarkers for liver cancer","authors":"Dahrii Paul, Vigneshwar Suriya Prakash Sinnarasan, Rajesh Das, Md Mujibur Rahman Sheikh, Santhosh Manickannan, Amouda Venkatesan","doi":"10.1016/j.compbiolchem.2026.108946","DOIUrl":"10.1016/j.compbiolchem.2026.108946","url":null,"abstract":"<div><div>Urinary proteins are promising non-invasive biomarkers, but their low abundance and wide dynamic range make detection challenging. This study presents UriPred, a computational tool that integrates machine learning (ML), BLAST, and Motif-EmeRging and Classes-Identification (MERCI) to predict urinary proteins and facilitate the identification of liver cancer (LC) biomarkers. A dataset of 10588 urinary and non-urinary proteins was curated, from which two feature types were generated: 10074 compositional and 20 evolutionary features. Seven feature selection methods were applied to compositional features, and 11 ML algorithms were trained on different feature sets. Evolutionary features achieved the highest training performance (AUC 0.79, accuracy 71.99 %), whereas amino acid composition (AAC) with 20 features achieved identical validation AUC (0.74) and comparable accuracy while being computationally less expensive and consistently selected. The ML-AAC model was therefore chosen as the final model. This optimal model was integrated with BLAST and MERCI to create UriPred, which reduced false positives from 34.59 % (ML) to 3.12 % (hybrid) on the validation dataset and from 5.8 % (ML) to zero (hybrid) on an external dataset. Using UriPred, 53 LC differentially expressed protein-coding genes were predicted as urinary proteins. 
Protein-protein interaction analysis, AUROC evaluation (AUC > 0.80), survival analysis, and cross-verification of urine detectability with the Human Protein Atlas and Human Urine PeptideAtlas databases identified five proteins (KIF23, COL15A1, CTHRC1, MMP9, and SPP1) as potential LC biomarkers. UriPred efficiently predicts urinary proteins using AAC features and enables biomarker discovery for LC. The tool is publicly available at <span><span>https://github.com/Dahrii-Paul/UriPred</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"122 ","pages":"Article 108946"},"PeriodicalIF":3.1,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146183832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01; Epub Date: 2026-02-08; DOI: 10.1016/j.compbiolchem.2026.108939
Shirui Li, Feng Jiang, Xiuyang Li
Background
Irritable Bowel Syndrome (IBS) and Major Depressive Disorder (MDD) exhibit high comorbidity, driven by dysregulation of gut-brain axis interactions. Despite evidence of shared pathophysiology, the core molecular mechanisms and therapeutic targets remain elusive, largely due to clinical heterogeneity and fragmented research approaches.
Methods
We established an integrated framework combining: (1) bidirectional epidemiological analysis using the CHARLS cohort; (2) multi-tissue transcriptomics (intestinal mucosa/prefrontal cortex) from GEO datasets using differential expression analysis, WGCNA, and machine learning (LASSO/RF/SVM-RFE); (3) PPI network reconstruction followed by multi-algorithm topological validation; (4) functional enrichment and immune deconvolution (CIBERSORTx); (5) bidirectional pharmacology (CTD-based compound screening and TCM network pharmacology); (6) molecular docking and short-term molecular dynamics (MD) simulations for binding stability assessment; (7) ADME/Tox profiling.
Results
Epidemiological analysis indicated bidirectional IBS-MDD risk (digestive to mental: OR = 1.82, 95% CI: 1.65-6.79; mental to digestive: OR = 3.34, 95% CI: 1.17-2.82). Integrated transcriptomics identified MPO, LCN2, and GMPPB as core comorbidity genes, validated across cohorts and linked to neutrophil activation, iron dysregulation, and glycosylation defects. Immune profiling revealed tissue-specific dysregulation, with gut-dominated neutrophil/M2 macrophage infiltration in IBS versus brain-enriched CD8⁺ T/NK cells in MDD. Bidirectional pharmacology prioritized bisphenol A/lipopolysaccharide (pathogenic) and resveratrol/quercetin (therapeutic) as high-affinity binders to core targets (ΔG < –7.0 kcal/mol). Short-term MD simulations provided preliminary support for the binding of key therapeutic compounds to the targets GMPPB and MPO, a result also supported by TCM herbs (e.g., Jujubae Fructus).
Conclusion
Our study analyzes neuro-immune-endocrine crosstalk underlying IBS-MDD comorbidity, nominating MPO/LCN2/GMPPB as diagnostic biomarkers and therapeutic targets. Environmental toxins and natural compounds offer actionable strategies for gut-brain axis modulation.
{"title":"Targeting the MPO/LCN2/GMPPB axis in IBS-depression comorbidity: Integrated multi-omics and bidirectional network pharmacology for precision diagnostics and therapeutics","authors":"Shirui Li , Feng Jiang , Xiuyang Li","doi":"10.1016/j.compbiolchem.2026.108939","DOIUrl":"10.1016/j.compbiolchem.2026.108939","url":null,"abstract":"<div><h3>Background</h3><div>Irritable Bowel Syndrome (IBS) and Major Depressive Disorder (MDD) exhibit high comorbidity, driven by dysregulation of gut-brain axis interactions. Despite evidence of shared pathophysiology, the core molecular mechanisms and therapeutic targets remain elusive, largely due to clinical heterogeneity and fragmented research approaches.</div></div><div><h3>Methods</h3><div>We established an integrated framework combining: (1) Bidirectional epidemiological analysis using the CHARLS cohort; (2) Multi-tissue transcriptomics (intestinal mucosa/prefrontal cortex) from GEO datasets using differential expression analysis, WGCNA, and machine learning (LASSO/RF/SVM-RFE); (3) PPI network reconstruction followed by multi-algorithm topological validation; (4) Functional enrichment and immune deconvolution (CIBERSORTx); (5) Bidirectional pharmacology (CTD-based compounds screening and TCM network pharmacology); (6) Molecular docking and short-term molecular dynamics (MD) simulations for binding stability assessment; (7) ADME/Tox Profiling.</div></div><div><h3>Results</h3><div>Epidemiological analysis indicated bidirectional IBS-MDD risk (Digestive to Mental: OR=1.82(95%CI:1.65-6.79), Mental to Digestive: OR=3.34(95%CI:1.17-2.82)). Integrated transcriptomics identified MPO, LCN2, and GMPPB as core comorbidity genes, validated across cohorts and linked to neutrophil activation, iron dysregulation, and glycosylation defects. Immune profiling revealed tissue-specific dysregulation, with gut-dominated neutrophil/M2 macrophage infiltration in IBS versus brain-enriched CD8⁺ T/NK cells in MDD. 
Bidirectional pharmacology prioritized bisphenol A/lipopolysaccharide (pathogenic) and resveratrol/quercetin (therapeutic) as high-affinity binders to core targets (ΔG < –7.0 kcal/mol). Short-term MD simulations provided preliminary support for the binding of key therapeutic compounds to targets GMPPB and MPO, supported by TCM herbs (e.g., Jujubae Fructus).</div></div><div><h3>Conclusion</h3><div>Our study analyzes neuro-immune-endocrine crosstalk underlying IBS-MDD comorbidity, nominating MPO/LCN2/GMPPB as diagnostic biomarkers and therapeutic targets. Environmental toxins and natural compounds offer actionable strategies for gut-brain axis modulation.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"122 ","pages":"Article 108939"},"PeriodicalIF":3.1,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146183922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01; Epub Date: 2026-01-15; DOI: 10.1016/j.compbiolchem.2026.108911
Yunfei Li, Weiwei Liu, Yiting Tian, Ying Wang, Yuelong Jia, Xiyin Wang
DNA rearrangements contribute to the formation of new chromosomes, which are often the foundation of speciation. Deciphering the order of DNA rearrangements facilitates the reconstruction of evolutionary trajectories from extant to ancestral chromosomes, a task that is computationally NP-hard. Here, a mathematically rigorous chromosome-rearrangement model is integrated with the “Chromosomal Inversion Path Exploration via Monte-Carlo Tree Search (CIPE-MCTS)” framework in a deeply coupled manner. This integration yields a robust analytical framework that ensures both global optimality and computational efficiency, thereby enabling the precise reconstruction of complex chromosomal evolutionary trajectories. The framework rigorously characterizes elementary rearrangement operations—inversion, translocation, fusion, and fission—within a strict graph-theoretic and group-theoretic formalism. On this basis, MCTS is introduced, guided by a domain-specific heuristic evaluation function that integrates a decay function and a dynamic hash table. The Upper Confidence Bound for Trees (UCT) serves as the node-selection criterion, while an extensible sampling strategy enables rapid convergence toward near-optimal solutions within the vast state space. Comprehensive benchmark tests demonstrate that, under identical hardware constraints, this method achieves significantly higher reconstruction accuracy than exhaustive global search, while its overall running time is markedly shorter than that of a single heuristic algorithm, thereby achieving simultaneous improvements in both accuracy and efficiency. This study introduces, for the first time, a scalable, reproducible, and mathematically guaranteed tool for accurate analysis of complex plant genomes, offering a novel quantitative perspective for elucidating the chromosomal basis of plant diversification and adaptive evolution.
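Two building blocks named in the abstract above can be sketched in a few lines: a signed inversion on a gene order, and the standard UCT score used to choose which child node to explore. This is only an illustrative sketch. The signed-integer encoding of genes, the function names, and the default exploration constant are assumptions; the abstract does not describe CIPE-MCTS's actual evaluation function, decay function, or hash table, so none of those are reproduced here.

```python
import math


def apply_inversion(chromosome: list[int], i: int, j: int) -> list[int]:
    """Invert the segment chromosome[i..j] (inclusive).

    Genes are signed integers (assumed encoding): the sign is orientation,
    so an inversion reverses the segment order and flips every sign.
    """
    segment = [-gene for gene in reversed(chromosome[i:j + 1])]
    return chromosome[:i] + segment + chromosome[j + 1:]


def uct_score(total_reward: float, visits: int, parent_visits: int,
              c: float = math.sqrt(2)) -> float:
    """Standard UCT value: mean reward plus an exploration bonus.

    Unvisited children get +inf so that each is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploitation = total_reward / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


# Inverting genes 2..4 of a five-gene chromosome.
inverted = apply_inversion([1, 2, 3, 4, 5], 1, 3)  # [1, -4, -3, -2, 5]
```

In a search loop, each candidate rearrangement would become a child node, and the child with the highest `uct_score` would be expanded next; larger values of `c` favour less-visited branches.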
{"title":"Optimal path reconstruction of plant chromosome evolution","authors":"Yunfei Li , weiwei Liu , Yiting Tian , Ying Wang , Yuelong Jia , Xiyin Wang","doi":"10.1016/j.compbiolchem.2026.108911","DOIUrl":"10.1016/j.compbiolchem.2026.108911","url":null,"abstract":"<div><div>DNA rearrangements contribute to the formation of new chromosomes, which are often the foundation of speciation. Deciphering the order of DNA rearrangements facilitates the reconstruction of evolutionary trajectories from extant to ancestral chromosomes, a task that is computationally NP-hard. Here, a mathematically rigorous chromosome-rearrangement model is integrated with the “Chromosomal Inversion Path Exploration via Monte-Carlo Tree Search (CIPE-MCTS)” framework in a deeply coupled manner. This integration yields a robust analytical framework that ensures both global optimality and computational efficiency, thereby enabling the precise reconstruction of complex chromosomal evolutionary trajectories. The framework rigorously characterizes elementary rearrangement operations—inversion, translocation, fusion, and fission—within a strict graph-theoretic and group-theoretic formalism. On this basis, MCTS is introduced, guided by a domain-specific heuristic evaluation function that integrates a decay function and a dynamic hash table. The Upper Confidence Bound for Trees (UCT) serves as the node-selection criterion, while an extensible sampling strategy enables rapid convergence toward near-optimal solutions within the vast state space. Comprehensive benchmark tests demonstrate that, under identical hardware constraints, this method achieves significantly higher reconstruction accuracy than exhaustive global search, while its overall running time is markedly shorter than that of a single heuristic algorithm, thereby achieving simultaneous improvements in both accuracy and efficiency. 
This study introduces, for the first time, a scalable, reproducible, and mathematically guaranteed tool for accurate analysis of complex plant genomes, offering a novel quantitative perspective for elucidating the chromosomal basis of plant diversification and adaptive evolution.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"122 ","pages":"Article 108911"},"PeriodicalIF":3.1,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145974567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01; Epub Date: 2026-01-08; DOI: 10.1016/j.compbiolchem.2026.108878
Euphinia Tiberius Kharsyiemiong, Bhupal Haribhakta Raghunandan, Seema Mishra
The lncRNA PVT1 can regulate the expression of multiple genes through diverse mechanisms, one of which is direct binding to mRNAs. Our previous study highlighted its regulatory role in pan-cancer systems through predicted interactions with select mRNAs that are significantly differentially expressed across 15 cancer types. The structural basis of these interactions has yet to be established. Here, to identify and compare the secondary structural features that may mediate PVT1 binding to select cancer-relevant mRNAs, and to look for evidence of evolutionary conservation, we used secondary structure folding information to identify key intermolecular interactions mediating the binding of PVT1 to specific regions of these mRNAs (5′UTR, coding region, and 3′UTR), which may influence translation. PVT1 forms stable secondary structures using both Watson-Crick and non-Watson-Crick base pairs, and its flexibility in interacting with these varied molecules at specific locations is deduced at the secondary structure level. To test for conserved structural elements in the PVT1 secondary structure, which was generated from 7SL non-coding RNA seed sequences, covariation analysis was performed; it identified 10 significantly co-varying base pairs, suggesting structural conservation. These lncRNA-mRNA interactions start mainly in open loop regions. In the PVT1 secondary structure, the loops contain more A and U nucleotides than G and C nucleotides. This may allow base-pairing interactions with other macromolecules to initiate more readily, because A-U base pairs are held together by weaker hydrogen bonding than G-C base pairs. By comparison, the number of purines in the loop regions varies among the secondary structures of these mRNAs. Because GC content correlates with the stability of mRNA secondary structures, our analysis suggests that, despite their variable sequence lengths, some of these mRNAs may form more stable secondary structures owing to their higher GC content. Further, increased secondary structure in the distal segment of the CDS and in the 3′UTR of an mRNA may correlate with high protein expression, and we found this pattern in a few of our select mRNA molecules. Exploring the sequence and structural details of these lncRNA-mRNA interactions suggested a probable mechanism by which a single PVT1 molecule can bind multiple mRNAs simultaneously or sequentially, in a spatio-temporal manner. Our research also seeks to further elucidate the contribution of bases and intermolecular interactions to the formation of these complexes.
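The two sequence statistics discussed above, overall GC content and the balance of A/U versus G/C nucleotides in loop regions, can be computed from a sequence and its predicted secondary structure. The sketch below assumes the structure is given in dot-bracket notation, where `.` marks an unpaired (loop) position; the function names and the toy hairpin are hypothetical and do not come from the study.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in an RNA (or DNA) sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def loop_base_counts(seq: str, dot_bracket: str) -> dict[str, int]:
    """Count A/U versus G/C nucleotides at unpaired positions.

    Assumes dot-bracket notation: '.' is unpaired, any other symbol is paired.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    counts = {"AU": 0, "GC": 0}
    for base, symbol in zip(seq.upper().replace("T", "U"), dot_bracket):
        if symbol != ".":
            continue  # paired position, not part of a loop
        if base in "AU":
            counts["AU"] += 1
        elif base in "GC":
            counts["GC"] += 1
    return counts


# Toy hairpin: a three-base-pair stem closing a four-nucleotide loop.
toy_seq = "GGGAAAUCCC"
toy_structure = "(((....)))"
toy_gc = gc_content(toy_seq)
toy_loops = loop_base_counts(toy_seq, toy_structure)
```

For the toy hairpin, all four loop positions are A or U while the stem is entirely G-C, which mirrors in miniature the A/U-rich loop pattern the abstract reports for PVT1.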
{"title":"Structural basis of long non-coding RNA PVT1 interactions with select mRNAs universal in pan-cancer system: A computational study","authors":"Euphinia Tiberius Kharsyiemiong, Bhupal Haribhakta Raghunandan, Seema Mishra","doi":"10.1016/j.compbiolchem.2026.108878","DOIUrl":"10.1016/j.compbiolchem.2026.108878","url":null,"abstract":"<div><div><em>PVT1</em> lncRNA can regulate multi-gene expression through diverse mechanisms, one of which is through binding interactions with mRNAs. Our previous study highlighted its regulatory role in pan-cancer systems through predicted interactions with select mRNAs that are significantly differentially expressed across 15 cancer types. The structural basis of these interactions are yet to be propounded. Here, in order to identify and compare secondary structural features that may mediate <em>PVT1</em> binding to select cancer-relevant mRNAs, and to determine evidence of evolutionary conservation, if any, we adopted the secondary structure folding information to identify key intermolecular interactions mediating the binding of <em>PVT1</em> to specific regions, 5′UTR, coding region, and 3′UTR, of these select mRNAs, which may influence the translation process. Forming stable secondary structures, using both Watson-Crick and non-Watson-Crick base pairs, the flexibility of <em>PVT1</em> lncRNA in interacting with these varied molecules at specific locations is deduced at the secondary structure level. To demonstrate the possible presence of conserved structural elements in <em>PVT1</em> secondary structure generated based on 7SL non-coding RNA seed sequences, covariation analysis identified 10 significantly co-varying base pairs, suggesting structural conservation. The location of the start point of these lncRNA-mRNA interactions is majorly in the open loop regions. A-U nucleotides in the loops are observed to be higher in number than G-C nucleotides in <em>PVT1</em> secondary structure. 
This may initiate multiple base-pairing interactions with other macromolecules more readily, owing to a lesser strength of the hydrogen bonding interactions between A-U base pairs. In the case of these mRNAs, comparatively speaking, there is a variability in the number of purines in the loop regions in their respective secondary structures. Since GC content correlates with the stability of mRNA secondary structures, our analysis shows that even though there is a variable sequence length, some of these mRNAs may demonstrate a higher stability of their specific secondary structures based on a higher GC content. Further, in order to potentially correlate with high protein expression, the distal segment of CDS and the 3′UTR regions of mRNAs require the presence of increased secondary structure. In our analysis, we found the same underlying pattern in a few of our select mRNA molecules. Exploration of the sequence and structural details of these lncRNA-mRNA interactions led us to an insight on a probable mechanism of a single <em>PVT1</em> molecule being able to bind multiple mRNAs simultaneously or sequentially, in a spatio-temporal manner. Our research also seeks to further elucidate the contribution of bases and intermolecular interactions in the formation of these complexes.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"122 ","pages":"Article 108878"},"PeriodicalIF":3.1,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145974168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}