Latest articles from Frontiers in bioinformatics

Cross-disease transcriptomic meta-analysis and network pharmacology reveal key therapeutic targets in rheumatoid arthritis, systemic lupus erythematosus and multiple sclerosis.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-21 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1744094

K Lakshmi, Sundararajan Vino

Autoimmune diseases have a complex etiology that is not yet fully understood. We aimed to identify highly perturbed differentially expressed genes (DEGs) and hub genes associated with the autoimmune diseases rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), and multiple sclerosis (MS), with the goal of finding targets that could lead to more effective therapies addressing the root causes of these diseases.

Materials and methods: Datasets for autoimmune diseases (RA, SLE, and MS) were collected from the GEO database. Differentially expressed genes were identified and subjected to meta-analysis to obtain common DEGs, which were then used for Gene Ontology (GO) functional enrichment and pathway analysis. A protein-protein interaction (PPI) network was constructed, and topology-based ranking identified hub genes. These hub genes were further analyzed through regulatory network analysis (transcription factors and miRNAs), gene-disease association studies, and drug-gene interaction analysis. Finally, molecular docking and molecular dynamics (MD) simulations were performed on the hub genes.

Results: A total of 341 differentially expressed genes were identified, of which 172 were upregulated and 169 downregulated. Among these, eight hub genes (STAT1, PTPRC, IRF8, JAK2, IL10RA, OAS2, CCR1, and IFI44L) were found to be closely associated with the diseases. Functional enrichment analysis revealed significant involvement in 143 biological processes, 53 cellular components, and 67 molecular functions, as well as 60 KEGG pathways. Regulatory network analysis highlighted interactions of the proposed hub genes with 198 transcription factors (TFs) and 993 microRNAs (miRNAs). Additionally, these genes were associated with 2,769 diseases, and 132 drugs were identified as interacting with them. Molecular docking studies, together with molecular dynamics simulation (MDS) stability analysis, demonstrated the potential of natural compounds and known immunomodulatory drugs as promising therapeutic candidates for clinical application.

Conclusion: This study identified DEGs shared across the autoimmune diseases RA, SLE, and MS. The hub genes, together with their associated transcription factors, appear to play a crucial role in these diseases and are potential clinical therapeutic targets.
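The abstract does not say which topology measure was used to rank hub genes. As a minimal sketch, assuming plain node degree (one common choice), the ranking step could look like the following. The function name `rank_by_degree` and the placeholder gene names are hypothetical and are not data from the study.

```python
from collections import defaultdict


def rank_by_degree(edges, top_k=3):
    """Rank nodes of an undirected network by degree (number of distinct partners)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, then by name so that ties are deterministic.
    ranked = sorted(neighbors.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(node, len(partners)) for node, partners in ranked[:top_k]]


# Toy edge list with placeholder gene names (illustrative only).
toy_edges = [
    ("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"),
    ("geneB", "geneC"), ("geneD", "geneE"),
    ("geneB", "geneA"),  # duplicate edge, counted once
]
hubs = rank_by_degree(toy_edges, top_k=2)
print(hubs)
```

Storing partners in sets means duplicate edges do not inflate a node's degree. Other centrality measures (betweenness, closeness) would change only the sort key.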


Citations: 0

Deep learning software and revised 2D model to segment bone in micro-CT scans.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-21 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1677527

Andrew H Lee, Ganesh Talluri, Manan Damani, Brandon Vera Covarrubias, Helena Hanna, Jeremy Chavez, Julian M Moore, Jacob Baradarian, Michael Molgaard, Beau Nielson, Kalah Walden, Thomas L Broderick, Layla Al-Nakkash

Deep learning (DL) enables automated bone segmentation in micro-CT datasets but can struggle to generalize across developmental stages, anatomical regions, and imaging conditions. We present BP-2D-03, a revised 2D Bone-Pores segmentation model. It was fitted to a dataset comprising 20 micro-CT scans spanning five mammalian species and 142,960 image patches. To manage the substantially larger and more varied dataset, we developed a DL software interface with modules for training ("BONe DLFit"), prediction ("BONe DLPred"), and evaluation ("BONe IoU"). These tools resolve prior issues such as slice-level data leakage, high memory usage, and limited multi-GPU support. Model performance was evaluated through three analyses. First, 5-fold cross-validation with three seeds per fold evaluated baseline robustness and stability. The model showed generally high mean Intersection-over-Union (IoU) with minimal variation across seeds, but performance varied more across folds, reflecting differences in scan composition. These findings show that the baseline model is stable overall but that predictive performance can decline for atypical scans. Second, 30 benchmarking experiments tested how model architecture, encoder backbone, and patch size influence segmentation IoU and computational efficiency. U-Net and UNet++ architectures with simple convolutional backbones (e.g., ResNet-18) achieved the highest IoU values, approaching 0.97. Third, cross-platform experiments confirmed that results are consistent across hardware configurations, operating systems, and implementations (Avizo 3D and standalone). Together, these analyses demonstrate that the BONe DL software delivers robust baseline performance and reproducible results across platforms.
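Intersection-over-Union, the metric reported above, is the number of pixels labeled positive in both masks divided by the number labeled positive in either. The sketch below is a generic illustration on small binary masks, not the "BONe IoU" module itself. The function name `iou` and the convention of returning 1.0 for two empty masks are assumptions.

```python
def iou(pred, truth):
    """Intersection-over-Union of two binary masks given as nested lists of 0/1."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    intersection = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks agree perfectly, so return 1.0
    # rather than dividing by zero.
    return 1.0 if union == 0 else intersection / union


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = iou(pred, truth)
print(score)  # 2 shared pixels / 4 pixels in the union
```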


Citations: 0

TRANSAID: a hybrid deep learning framework for translation site prediction with integrated biological feature scoring.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-19 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1676149

Yan Li, Boran Wang, Zhen Liu, Wei Wei, Caiyi Fei, Shi Xu, Tiyun Han, Wei Geng, Zengding Wu

Introduction: Translation initiation and termination are critical regulatory checkpoints in protein synthesis, yet accurate computational prediction of their sites remains challenging due to training data biases and the complexity of full-length transcripts.

Methods: To address these limitations, we present TRANSAID (TRANSlation AI for Detection), a novel deep learning framework that accurately and simultaneously predicts translation initiation (TIS) and termination (TTS) sites from complete transcript sequences. TRANSAID's hierarchical architecture efficiently processes long transcripts, capturing both local motifs and long-range dependencies. Crucially, the model was trained on a human transcriptome dataset that was rigorously partitioned at the gene level to prevent data leakage and included both protein-coding (NM) and non-coding (NR) transcripts.

Results: This mixed-training strategy enables TRANSAID to achieve high fidelity, correctly identifying 73.61% of NR transcripts as non-coding. Performance is further enhanced by an integrated biological scoring system, improving "perfect ORF prediction" for coding sequences to 94.94% and "correct non-coding prediction" to 82.00%. The human-trained model demonstrates remarkable cross-species applicability, maintaining high accuracy on organisms from mammals to yeast. Beyond annotation, TRANSAID serves as a powerful discovery tool for novel coding events. When applied to long-read sequencing data, it accurately identified previously unannotated protein isoforms validated by mass spectrometry (76.28% validation rate). Furthermore, homology searches of high-scoring ORFs predicted within NR transcripts suggest a strong potential for identifying cryptic translation events.

Discussion: As a fully documented open-source tool with a user-friendly web server, TRANSAID provides a powerful and accessible resource for improving transcriptome annotation and proteomic discovery.
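Gene-level partitioning, mentioned in the Methods, means that all transcripts of one gene fall on the same side of the train/test split, so near-identical isoforms cannot leak between sets. The abstract does not describe how the partition was made. The sketch below assumes one simple option, hashing the gene ID. The function name `split_by_gene` and the record identifiers are hypothetical.

```python
import hashlib


def split_by_gene(transcripts, test_fraction=0.2):
    """Split (transcript_id, gene_id) records so that each gene lands in exactly one set."""
    train, test = [], []
    for transcript_id, gene_id in transcripts:
        digest = hashlib.sha256(gene_id.encode("utf-8")).digest()
        # Map the first 8 bytes of the hash to a number in [0, 1).
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        target = test if bucket < test_fraction else train
        target.append((transcript_id, gene_id))
    return train, test


# Placeholder records: several transcripts per gene (illustrative only).
records = [(f"tx{i}", f"gene{i % 10}") for i in range(40)]
train, test = split_by_gene(records, test_fraction=0.3)
train_genes = {gene for _, gene in train}
test_genes = {gene for _, gene in test}
print(len(train), len(test), train_genes & test_genes)
```

Because the assignment depends only on the gene ID, it is reproducible across runs and stays stable when new transcripts of a known gene are added. The realized test fraction is approximate for small gene counts.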


Citations: 0

PreBP: an interpretable, optimized ensemble framework using routine complete blood count for rapid pathogen identification in bacterial pneumonia.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-14 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1769816

Xiaoxi Hao, Dingjian Liang, Yimin Shen, Cuimin Sun, Wei Lan

Introduction: Bacterial pneumonia remains a major global health challenge, and early pathogen identification is important for timely and targeted treatment. However, conventional microbiological diagnostics such as sputum or blood culture are labor-intensive and time-consuming.

Methods: We propose an interpretable ensemble learning framework (PreBP) for rapid pathogen identification using routinely available complete blood count (CBC) parameters. We analyzed 1,334 CBC samples from patients with culture-confirmed bacterial pneumonia caused by four major pathogens: Pseudomonas aeruginosa, Escherichia coli, Staphylococcus aureus, and Streptococcus pneumoniae. Pathogen labels were determined based on clinical culture results. Five machine learning models (extreme gradient boosting (XGBoost), multilayer perceptron neural network (MLPNN), adaptive boosting (AdaBoost), random forest (RF), and extremely randomized trees (ExtraTrees)) were trained as comparators, and PreBP was developed with metaheuristic-optimized hyperparameters. Key CBC biomarkers were refined using a dual-phase feature selection strategy combining Lasso and Boruta. To enhance transparency, SHapley additive explanations (SHAP) were applied to provide both global biomarker importance and local, case-level explanations.

Results: PreBP achieved the best overall performance, with an AUC of 0.920, precision of 87.1%, and accuracy and sensitivity of 86.7%.

Discussion: Because the framework relies on routine CBC measurements, it can generate interpretable predictions as soon as CBC results are available, which may provide supplementary evidence for earlier pathogen-oriented clinical decision-making alongside culture-dependent workflows. Overall, PreBP offers an interpretable computational approach for pathogen identification in bacterial pneumonia based on routine laboratory data.


Citations: 0

An integrated subtractive genomics and immunoinformatic approach for designing a multi-epitope peptide vaccine against methicillin-resistant Staphylococcus aureus.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-14 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1745495

Nandha Kumar Subramani, Subhashree Venugopal, Anand Prem Rajan

Introduction: Methicillin-resistant Staphylococcus aureus (MRSA) is a multidrug-resistant bacterium responsible for severe infections and has become a major health concern. Given the constraints of traditional methods, a new approach is needed to prevent MRSA-related infections by targeting key pathogens.

Methods: Subtractive genomics was first applied to the MRSA proteome to identify non-homologous, essential, and virulence targets using comparative BLAST-based screening. Immunoinformatic tools were then employed for B- and T-cell epitope prediction and for vaccine construction with appropriate adjuvants and linkers, followed by immune simulation and molecular docking with immune receptors.

Results: Comparative metabolic pathway analysis identified 294 MRSA pathway proteins, with acetolactate synthase (ALS) emerging as a non-homologous, essential, and virulent protein involved in the branched-chain amino acid biosynthesis pathway. The constructed ALS vaccine, consisting of 3 B-cell and 19 T-cell epitopes, exhibited stable immunological features with 97.55% global population coverage. Molecular docking revealed that the ALS vaccine bound the TLR4 receptor (-1,438.7 kcal/mol) more strongly than the TLR2 receptor (-1,103.5 kcal/mol), which was further supported by high structural stability and compactness analysis. Immune simulations also showed elevated IgM, IgG subtypes, and cytokine production, suggesting robust humoral and cellular immunity.

Discussion: The identification of ALS highlights its biological relevance to MRSA survival. The stability predictions with TLR4 suggest effective activation of innate immunity, which may enhance antigen presentation and downstream adaptive immunity. Validating the safety and immunogenicity of the ALS vaccine still requires comprehensive in vitro and in vivo studies.

Conclusion: Thus, ALS is recognized as a promising MRSA vaccine candidate and has the potential to activate immune responses effectively.


Citations: 0

SpaLLM: a general framework for spatial domain identification with large language models.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-12 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1713975

Zeyu Zou, Ziheng Duan

Spatial transcriptomics (ST) technologies enable the profiling of gene expression while preserving spatial context, offering unprecedented insights into tissue organization. However, traditional spatial domain identification methods primarily rely on gene expression matrices and spatial coordinates while overlooking the rich biological knowledge encoded in gene functional descriptions. Here, we propose SpaLLM, a general framework that integrates large language model (LLM) embeddings of gene descriptions with conventional spatial transcriptomics analysis. Our approach leverages pre-computed GenePT embeddings from NCBI gene summaries to create biologically informed gene representations. SpaLLM combines these LLM-derived gene features with cell-gene expression matrices through matrix multiplication, generating enriched cell representations that capture both expression patterns and functional knowledge. These enriched features are then integrated with existing graph-based spatial analysis methods for improved spatial domain identification. Extensive validation on 12 sequencing-based Visium sections and an independent imaging-based osmFISH dataset demonstrates that SpaLLM consistently enhances spatial domain identification. Our modular framework can be seamlessly integrated with existing spatial analysis pipelines, making it broadly applicable to diverse research scenarios.
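The matrix multiplication described above maps a cells-by-genes expression matrix and a genes-by-dimensions embedding matrix to a cells-by-dimensions feature matrix, so each cell becomes an expression-weighted sum of its genes' embeddings. The sketch below illustrates only that step with toy numbers in plain Python. The helper `matmul`, the two-dimensional embeddings, and the absence of any normalization are assumptions, and the values are not real GenePT vectors.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as nested lists."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Toy expression matrix: 2 cells x 3 genes (illustrative values only).
expression = [
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 1.0],
]
# Toy gene embeddings: 3 genes x 2 dimensions (real LLM embeddings are much wider).
gene_embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
cell_features = matmul(expression, gene_embeddings)
print(cell_features)  # one 2-dimensional feature vector per cell
```

In practice an array library would do this product on sparse or dense matrices. The resulting cell features would then be passed to a graph-based spatial method, as the abstract describes.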


Citations: 0

Recent trends in machine learning and deep learning-based prediction of G-protein coupled receptor-ligand binding affinities.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-12 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1712577

Joshua Stephenson, Konda Reddy Karnati

Accurately predicting protein-ligand binding affinity is key in drug discovery. Machine learning and deep learning methods used in the drug discovery process have advanced the prediction of drug-target binding affinities, particularly for G protein-coupled receptors (GPCRs), a pharmacologically significant yet structurally heterogeneous protein family. In this review, binding affinity prediction models are examined and organized according to sequence-based one-dimensional, graph-based two-dimensional, and structure-based three-dimensional frameworks. Sequence-based models utilize convolutional neural networks for high-throughput screening. Recently published models have incorporated attention mechanisms and self-supervised learning, enhancing interpretability and reducing dependence on annotated datasets. Graph-based models employ graph neural networks and molecular contact maps to capture topological features, enabling substructure-sensitive predictions. Structure-based approaches integrate spatial and conformational data into high-resolution interaction models. The hybrid use of these three approaches could significantly increase the success rate of in silico models for drug discovery, particularly for GPCRs.
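As a rough illustration of the sequence-based (one-dimensional) family, the sketch below one-hot encodes a protein sequence and applies a single convolutional filter followed by ReLU and global max pooling. It is not taken from any model in the review. The helper names, the toy sequences, and the hand-written filter are assumptions; a trained network would learn many filters and feed the pooled features into further layers.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a (length, 20) one-hot matrix."""
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for pos, aa in enumerate(sequence):
        encoded[pos, AA_INDEX[aa]] = 1.0  # raises KeyError on non-standard residues
    return encoded


def conv1d_max(encoded, filters):
    """Valid 1D cross-correlation, then ReLU, then global max pooling.

    encoded: (length, 20) one-hot sequence
    filters: (n_filters, width, 20)
    returns: (n_filters,) one pooled feature per filter
    """
    n_filters, width, _ = filters.shape
    n_windows = encoded.shape[0] - width + 1
    if n_windows < 1:
        raise ValueError("sequence is shorter than the filter width")
    # Stack every sliding window: shape (n_windows, width, 20).
    windows = np.stack([encoded[i:i + width] for i in range(n_windows)])
    activations = np.einsum("nwc,fwc->nf", windows, filters)
    return np.maximum(activations, 0.0).max(axis=0)


# Hand-written filter that responds to the tripeptide "DRY" (illustrative only).
motif_filter = one_hot("DRY")[np.newaxis, :, :]
features_hit = conv1d_max(one_hot("MADRYLA"), motif_filter)   # contains DRY
features_miss = conv1d_max(one_hot("MAAAALA"), motif_filter)  # does not
print(features_hit, features_miss)
```

The pooled feature equals the filter width when the motif is present and zero when none of its residues align, which is the sense in which a convolutional filter acts as a position-independent motif detector.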

Citations: 0

Integrative transcriptomic analysis reveals microglial metabolic-inflammatory crosstalk of HK2-HSPA5-TNF axis after intracerebral hemorrhage.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-12 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1740715

Yi Zhang, Yongqian Liu, Wei Meng, Xiaobo Yu, Xiaojun Xu

Background: Intracerebral hemorrhage (ICH) triggers secondary brain injury through neuroinflammation, yet the interplay between metabolic reprogramming and inflammatory responses remains poorly defined. This study investigated how glucose metabolism dysregulation contributes to neuroinflammatory pathogenesis following ICH.

Methods: We integrated transcriptomic datasets from bulk RNA sequencing (human perihematomal tissue), single-cell RNA sequencing (mouse ICH model), and spatial transcriptomics (mouse time-series). Bioinformatic analyses included differential expression screening, single-cell weighted gene co-expression network analysis, pseudotemporal trajectory reconstruction, and cell-cell communication inference to identify key metabolic-inflammation regulators and their spatiotemporal dynamics.

Results: Multi-omics convergence revealed hexokinase 2 (HK2), heat shock protein A5 (HSPA5), and tumor necrosis factor (TNF) as core regulators linking glucose metabolism to neuroinflammation. Single-cell analysis showed significant time-dependent regulation of HK2 in microglia, while spatial transcriptomics uncovered synchronized alterations of HK2, HSPA5, and TNF in perihematomal regions at day 7. Cell communication analysis highlighted enhanced microglia-to-neutrophil signaling via Tnf-Tnfrsf1b pairs, with TNF signaling identified as the most significantly upregulated pathway in ICH conditions.

Conclusion: Our multi-omics approach reveals coordinated dysregulation of glucose metabolism and inflammatory genes following ICH, with time-dependent HK2 regulation in microglia and synchronized transcriptional changes at day 7 representing critical events in neuroinflammatory progression. The identified gene networks and cellular communication patterns provide new insights into the metabolic-immune interface in ICH, offering potential targets for future therapeutic strategies.
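The cell-cell communication inference mentioned in the Methods can be pictured with a deliberately simple ligand-receptor score: mean ligand expression in the sender population multiplied by mean receptor expression in the receiver population. This is a generic simplification, not the tool or statistic used in the study. The function name `lr_score` and all expression values are made up for illustration; only the Tnf-Tnfrsf1b pair and the two cell types come from the abstract.

```python
import numpy as np


def lr_score(expr, cell_types, genes, sender, receiver, ligand, receptor):
    """Score one ligand-receptor pair for a sender -> receiver direction.

    expr:       (n_cells, n_genes) expression matrix
    cell_types: length n_cells sequence of cell-type labels
    genes:      list of gene names matching the columns of expr
    returns:    mean(ligand in sender cells) * mean(receptor in receiver cells)
    """
    expr = np.asarray(expr, dtype=float)
    cell_types = np.asarray(cell_types)
    ligand_mean = expr[cell_types == sender, genes.index(ligand)].mean()
    receptor_mean = expr[cell_types == receiver, genes.index(receptor)].mean()
    return float(ligand_mean * receptor_mean)


# Toy data (made up): 2 microglia and 2 neutrophils, 2 genes.
genes = ["Tnf", "Tnfrsf1b"]
cell_types = ["microglia", "microglia", "neutrophil", "neutrophil"]
expr = np.array([[4.0, 0.0],
                 [2.0, 0.0],
                 [0.0, 3.0],
                 [0.0, 1.0]])

forward = lr_score(expr, cell_types, genes,
                   "microglia", "neutrophil", "Tnf", "Tnfrsf1b")
reverse = lr_score(expr, cell_types, genes,
                   "neutrophil", "microglia", "Tnf", "Tnfrsf1b")
print(forward, reverse)
```

The score is directional: it is high only when the sender expresses the ligand and the receiver expresses the receptor. Dedicated tools additionally test such scores against permuted cell labels, which this sketch omits.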

Citations: 0

High-dimensional co-expression network analysis reveals persistent TRH gene expression throughout axolotl telencephalon regeneration.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-12 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1697212

Iveth Gómez-Morales, Adriana P Mendizabal-Ruiz, J Alejandro Morales, Teresa Romero-Gutiérrez

Introduction: The axolotl (Ambystoma mexicanum) offers deep insight into brain regeneration because it can fully reconstruct its telencephalon after injury, a capability that most vertebrates lack. This study aimed to identify hub genes (highest-weighted genes) underlying this process and to map their cellular location by analyzing spatiotemporal transcriptomic data using high-dimensional weighted gene co-expression network analysis (hdWGCNA), integrating protein-protein interaction networks, and cross-validating findings against the literature.

Results: We identified 180 hub genes across the regeneration timeline, including several with conserved orthologs previously reported in vertebrate regeneration models. Among these candidates, TRH (Thyrotropin-Releasing Hormone) displayed the most consistent spatiotemporal pattern, appearing repeatedly as a hub gene and localizing to MSN-enriched regions at multiple stages. TRH is broadly characterized in vertebrates as a neuroendocrine peptide with roles in hormonal signaling, and MSNs are known to respond to a variety of hormonal and neuropeptidergic cues. In our dataset, this background provides additional perspective on the transcriptional configurations in which TRH appears. Other hub genes showed stage- or cell-specific patterns, together outlining a heterogeneous and dynamic landscape of transcriptional states detected during telencephalon regeneration.

Conclusion: This study provides a descriptive map of gene co-expression dynamics during axolotl telencephalon regeneration. By integrating hdWGCNA, spatial transcriptomics, and network-based context, we identify hub genes and transcriptional states associated with injury response, including a persistent TRH-linked MSN state. These findings offer a foundation for future experimental studies aimed at elucidating the molecular basis of axolotl brain repair.
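To make the idea of a "highest-weighted" hub gene concrete, the sketch below ranks genes by connectivity in a soft-thresholded correlation network, the basic construction behind weighted co-expression analysis. It is a simplified stand-in, not the hdWGCNA workflow used in the study: the function name, the soft-threshold power `beta = 6`, and the toy expression values are all assumptions.

```python
import numpy as np


def connectivity(expr, beta=6):
    """Per-gene connectivity in a soft-thresholded co-expression network.

    expr: (n_samples, n_genes) expression matrix
    beta: soft-threshold power applied to absolute correlations (assumed value)
    returns: (n_genes,) sum of adjacency weights to all other genes
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    adjacency = np.abs(corr) ** beta
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections
    return adjacency.sum(axis=1)


# Toy data (made up): 5 samples x 4 hypothetical genes.
# Genes 0-2 rise together across samples; gene 3 fluctuates independently.
toy_expr = np.array([[1, 2, 1, 3],
                     [2, 4, 2, 1],
                     [3, 6, 3, 3],
                     [4, 8, 5, 1],
                     [5, 10, 4, 3]])
k = connectivity(toy_expr)
ranked = np.argsort(k)[::-1]  # gene indices from most to least connected
print(k.round(3), ranked)
```

Genes 0 and 1 are perfectly correlated with each other and strongly correlated with gene 2, so they rank as the most connected, while the independent gene 3 ranks last. Raising correlations to a power suppresses weak links, which is why hub rankings depend on the chosen threshold.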

Citations: 0

REST missense mutations reveal disrupted Re1 motif binding and co-repressor interactions in uterine fibroids.

IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in bioinformatics

Pub Date : 2026-01-12 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1703356

Srineevas Sriram, Chandresh Palanichamy, P T Subash, Manshi Kumari Gupta, C Sudandiradoss

Introduction: The Re1-Silencing Transcription Factor (REST) is a master regulator of gene silencing, orchestrating transcriptional repression by tethering chromatin-modifying co-repressors to the Re1 motif of target genes. While REST is recognized as a sentinel of cellular identity, its role in uterine fibroids (UF) remains unclear. This study aims to investigate how structural perturbations in REST may compromise its regulatory function and contribute to altered transcriptional control in fibroid biology.

Methods: A deep structural interrogation of REST was performed through expansive in silico analysis of 938 missense SNPs. Evolutionary conservation was assessed across ten primate species to identify structurally disruptive variants. Structural modelling, protein-protein and protein-DNA docking analyses were conducted to evaluate interactions with co-repressors and DNA. Molecular dynamics simulations were used to assess conformational stability, flexibility, compactness, and energetic changes in wild-type and mutant REST variants.

Results: Five structurally disruptive REST variants (Y31C, Y31D, L76Q, Y283C, L427Q) were identified at evolutionarily conserved residues. Structural modelling and docking analyses revealed weakened affinity for co-repressors, with the Y283C variant showing a marked reduction in SIN3A interaction (Z-score: 2.4 to -1.2) and impaired DNA binding (Z-score: 2.0 to -1.3). Molecular dynamics simulations demonstrated that Y283C increased rigidity (RMSF: 0.33 to 0.27 nm), reduced compactness (Rg: 3.48-3.51 nm), and lowered potential energy. Upon Re1 binding, destabilization intensified, with increased RMSD (0.95-1.07 nm) and pronounced shifts in energy.

Discussion: This integrative analysis highlights REST as a candidate regulatory component in uterine fibroid biology. Structural disruption of REST, particularly through the Y283C mutation, may destabilize molecular interactions and compromise DNA-binding precision, potentially unleashing transcriptional noise that fuels fibroid growth. These findings suggest that perturbation of REST-mediated transcriptional repression may be associated with altered regulatory control in this disease and could inform future strategies to investigate dysregulation in uterine fibroids.
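The two trajectory metrics quoted in the Results, root-mean-square fluctuation (RMSF) and radius of gyration (Rg), have standard definitions that can be written in a few lines of NumPy. The sketch below is a minimal illustration on made-up coordinates, not the authors' analysis pipeline. It assumes the trajectory frames have already been aligned to a common reference, and the function names are hypothetical.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for a single frame.

    coords: (n_atoms, 3) positions in nm
    masses: (n_atoms,) atomic masses
    """
    coords = np.asarray(coords, dtype=float)
    center = np.average(coords, axis=0, weights=masses)  # center of mass
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the mean position.

    trajectory: (n_frames, n_atoms, 3), assumed already aligned
    returns:    (n_atoms,) RMSF values in the same length unit
    """
    trajectory = np.asarray(trajectory, dtype=float)
    mean_pos = trajectory.mean(axis=0)
    sq_dev = ((trajectory - mean_pos) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))


# Toy system (made up): two equal-mass atoms 2 nm apart.
toy_frame = np.array([[0.0, 0.0, 0.0],
                      [2.0, 0.0, 0.0]])
rg = radius_of_gyration(toy_frame, masses=[1.0, 1.0])

# Two frames: atom 0 is fixed, atom 1 moves 0.2 nm along y.
toy_traj = np.array([[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
                     [[0.0, 0.0, 0.0], [2.0, 0.2, 0.0]]])
fluct = rmsf(toy_traj)
print(rg, fluct)
```

Read against these definitions, a lower RMSF means residues move less around their average positions (greater rigidity), while a higher Rg means mass sits farther from the center on average (a less compact fold).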

Citations: 0