Bioinformatics and Biology Insights: Latest Articles, Page 4

The Microchimerism Literature Atlas.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-04-03 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251324104

Michael Christian Gruber, Daniel Kummer, Katja Sallinger, Henderson James Cleaves, Arsev Umur Aydinoğlu, Thomas Kroneis

The Microchimerism Literature Atlas (MCLA) is a comprehensive online dataset that facilitates the investigation of microchimerism (MC), a condition in which individuals harbor cells from another individual of the same species. The MCLA provides access to more than 15 000 references from MC research, covering peer-reviewed articles and reviews from 1970 to the present. Key features include a multidimensional search function and logical operators for assembling search queries. The MCLA dataset offers a clearly structured data table view, combined with dynamic graphical data representation and visual citation analysis, aiding in the investigation and identification of research trends and patterns. The MCLA supports data export in various formats and receives regular updates. The MCLA is being developed as an essential resource for the MC research community, and its framework is easily adaptable to custom literature datasets, enabling its use in other research fields.
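A search that combines logical operators with additional dimensions such as year and publication type can be pictured with the minimal Python sketch below. It is not the MCLA implementation: the `Reference` fields, the `matches` function, and the toy records are all hypothetical and exist only to illustrate AND / OR / NOT filtering over a literature table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical record layout; the real MCLA schema is not described in the abstract."""
    title: str
    year: int
    kind: str  # e.g. "article" or "review"


def matches(ref, all_terms=(), any_terms=(), none_terms=(), year_range=None, kind=None):
    """Return True if `ref` satisfies the AND / OR / NOT term constraints and optional filters."""
    text = ref.title.lower()
    if any(t.lower() not in text for t in all_terms):  # AND: every term must occur
        return False
    if any_terms and not any(t.lower() in text for t in any_terms):  # OR: at least one term
        return False
    if any(t.lower() in text for t in none_terms):  # NOT: no excluded term may occur
        return False
    if year_range and not (year_range[0] <= ref.year <= year_range[1]):
        return False
    if kind and ref.kind != kind:
        return False
    return True


# Toy records invented for illustration; they are not real publications.
refs = [
    Reference("toy record: fetal microchimerism overview", 2004, "review"),
    Reference("toy record: maternal microchimerism and transplantation", 1998, "article"),
    Reference("toy record: fetal cells in maternal blood", 1979, "article"),
]

hits = [
    r for r in refs
    if matches(
        r,
        all_terms=["microchimerism"],
        any_terms=["fetal", "maternal"],
        none_terms=["transplantation"],
        year_range=(1970, 2010),
    )
]
```

Keeping each operator as a separate argument makes it easy to add further dimensions (author, journal) without changing the matching logic.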


Citations: 0

Development of a Novel mRNA Vaccine Against Shigella Pathotypes Causing Widespread Shigellosis Endemic: An In-Silico Immunoinformatic Approach.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-28 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251328302

Abdur Razzak, Otun Saha, Khandokar Fahmida Sultana, Mohammad Ruhul Amin, Abdullah Bin Zahid, Afroza Sultana, Uditi Paul Bristi, Sultana Rajia, Nikkon Sarker, Md Mizanur Rahaman, Newaz Mohammed Bahadur, Foysal Hossen

Shigellosis remains a major global health concern, particularly in regions with poor sanitation and limited access to clean water. This study used immunoinformatics and reverse vaccinology to design a potential mRNA vaccine targeting Shigella pathotypes. Out of 4071 proteins from Shigella sonnei str. Ss046, 4 key antigenic candidates were identified: putative outer membrane protein (Q3YZL0), PapC-like porin protein (Q3YZM5), putative fimbrial-like protein (Q3Z3I2), and lipopolysaccharide (LPS)-assembly protein LptD (Q3Z5V5), ensuring broad pathotype coverage. A multi-epitope vaccine was designed incorporating cytotoxic T lymphocyte, helper T lymphocyte, and B-cell epitopes, joined with suitable linkers and adjuvants to enhance immunogenicity. Computational analyses predicted favorable antigenicity, solubility, and stability for the vaccine, while molecular docking and dynamics simulations demonstrated strong binding affinity and stability with Toll-like receptor 4 (TLR-4), indicating potential for robust immune activation. Immune simulations predicted strong humoral and cellular immune responses, characterized by significant cytokine production and long-term immune memory. Structural evaluations of the complex, including radius of gyration, root mean square deviation, root mean square fluctuation, and solvent accessibility, confirmed the vaccine's structural integrity and stability under physiological conditions. This research contributes to the ongoing effort to alleviate the global burden of Shigella infections, providing a foundation for future wet laboratory investigations aimed at vaccine development.
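The step of joining epitopes with linkers and an adjuvant can be sketched in Python as follows. The abstract does not name the linkers or list the epitope sequences, so everything specific here is an assumption: the linker motifs (EAAAK, AAY, GPGPG, KK) are ones frequently seen in multi-epitope design studies, and the sequences in the test data are short placeholders, not the study's epitopes.

```python
# Assumed linker motifs; the abstract only says "suitable linkers".
DEFAULT_LINKERS = {
    "adjuvant": "EAAAK",  # rigid linker between adjuvant and first epitope block
    "ctl": "AAY",         # between cytotoxic T lymphocyte epitopes
    "htl": "GPGPG",       # between helper T lymphocyte epitopes
    "bcell": "KK",        # between B-cell epitopes
}


def assemble_construct(adjuvant, ctl, htl, bcell, linkers=None):
    """Concatenate adjuvant and epitope blocks into one linear protein sequence.

    Layout (an illustrative choice, not the paper's stated order):
    adjuvant - EAAAK - CTL block - GPGPG - HTL block - KK - B-cell block
    """
    linkers = linkers or DEFAULT_LINKERS
    parts = [
        adjuvant,
        linkers["adjuvant"],
        linkers["ctl"].join(ctl),
        linkers["htl"],
        linkers["htl"].join(htl),
        linkers["bcell"],
        linkers["bcell"].join(bcell),
    ]
    return "".join(parts)


# Placeholder sequences for illustration only.
construct = assemble_construct(
    adjuvant="MA",
    ctl=["CC", "DD"],
    htl=["HH"],
    bcell=["BB", "WW"],
)
```

The resulting amino-acid string would then be the input to downstream antigenicity, solubility, and docking predictions of the kind the abstract describes.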


Citations: 0

Dynamic Gene Attention Focus (DyGAF): Enhancing Biomarker Identification Through Dual-Model Attention Networks.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-27 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251325390

Md Khairul Islam, Himanshu Wagh, Hairong Wei

The DyGAF model, which stands for Dynamic Gene Attention Focus, is designed to address the challenges in biomarker detection, progression reporting of pathogen infection, and disease diagnostics. The DyGAF model introduced a novel dual-model attention-based mechanism within neural networks, combined with machine learning algorithms, to enhance the process of biomarker identification. The model went beyond traditional diagnostic approaches by analyzing gene expression data in detail. DyGAF not only identified but also ranked genes based on their significance, revealing a comprehensive list of the top genes essential for disease detection and prognosis. In addition, KEGG pathways, Wiki Pathways, and Gene Ontology-based analyses provided a multilevel evaluation of the genes' roles. In our analyses, we applied the model to COVID-19 gene expression profiles from nasopharyngeal swabs, which offer a more nuanced view of the intricate interplay between the host and the virus. The genes ranked by the DyGAF model were compared against those selected by differential expression analysis and random forest feature selection methods for further validation of our model. DyGAF identified important biomarkers that could enrich gene ontologies and pathways crucial for elucidating the pathogenesis of COVID-19. Furthermore, DyGAF was also employed for diagnosing COVID-19 patients by classifying gene-expression profiles with an accuracy of 94.23%. Benchmarking against other conventional models revealed DyGAF's superior performance, highlighting its effectiveness in identifying and categorizing COVID-19 cases. In summary, the DyGAF model represents a significant advancement in genomic research, providing a more comprehensive and precise tool for identifying key genetic markers and unraveling the complex biology of a disease. The DyGAF model is available as a software package at the following link: https://github.com/hiddenntreasure/DyGAF.


Citations: 0

Evolutionary and epidemic dynamics of COVID-19 in Germany exemplified by three Bayesian phylodynamic case studies.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-12 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251321065

Sanni Översti, Ariane Weber, Viktor Baran, Bärbel Kieninger, Alexander Dilthey, Torsten Houwaart, Andreas Walker, Wulf Schneider-Brachert, Denise Kühnert

The importance of genomic surveillance strategies for pathogens has been particularly evident during the coronavirus disease 2019 (COVID-19) pandemic, as genomic data from the causative agent, severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), have guided public health decisions worldwide. Bayesian phylodynamic inference, integrating epidemiology and evolutionary biology, has become an essential tool in genomic epidemiological surveillance. It enables the estimation of epidemiological parameters, such as the reproductive number, from pathogen sequence data alone. Despite the phylodynamic approach being widely adopted, the abundance of phylodynamic models often makes it challenging to select the appropriate model for specific research questions. This article illustrates the application of phylodynamic birth-death-sampling models in public health using genomic data, with a focus on SARS-CoV-2. Targeting researchers less familiar with phylodynamics, it introduces a comprehensive workflow, including the conceptualisation of a research study and detailed steps for data preprocessing and postprocessing. In addition, we demonstrate the versatility of birth-death-sampling models through three case studies from Germany, utilising the BEAST2 software and its model implementations. Each case study addresses a distinct research question relevant not only to SARS-CoV-2 but also to other pathogens: Case study 1 finds traces of a superspreading event at the start of an early outbreak, exemplifying how simple models for genomic data can provide information that would otherwise only be accessible through extensive contact tracing. Case study 2 compares transmission dynamics in a nosocomial outbreak to community transmission, highlighting distinct dynamics through integrative analysis. 
Case study 3 investigates whether local transmission patterns align with national trends, demonstrating how phylodynamic models can disentangle complex population substructure with little additional information. For each case study, we emphasise critical points where model assumptions and data properties may misalign and outline appropriate validation assessments. Overall, we aim to provide researchers with examples on using birth-death-sampling models in genomic epidemiology, balancing theoretical and practical aspects.
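How a birth-death-sampling model links its rates to the reproductive number can be shown with a small Python sketch. It assumes the common parameterisation in which lineages transmit at a birth rate, stop being infectious through either death/recovery or sampling, and are removed upon sampling; the function names and the example rates are illustrative and are not taken from the article or from BEAST2.

```python
def effective_reproductive_number(birth_rate, death_rate, sampling_rate):
    """R_e = birth rate / total becoming-uninfectious rate (death + sampling).

    Assumes sampled individuals are removed from the infectious pool.
    """
    becoming_uninfectious = death_rate + sampling_rate
    if becoming_uninfectious <= 0:
        raise ValueError("death_rate + sampling_rate must be positive")
    return birth_rate / becoming_uninfectious


def sampling_proportion(death_rate, sampling_rate):
    """Fraction of removed lineages that end up as sequenced samples."""
    total = death_rate + sampling_rate
    if total <= 0:
        raise ValueError("death_rate + sampling_rate must be positive")
    return sampling_rate / total


# Illustrative rates (per unit time), chosen only to make the arithmetic easy to follow.
r_e = effective_reproductive_number(birth_rate=3.0, death_rate=1.0, sampling_rate=0.5)
s = sampling_proportion(death_rate=1.0, sampling_rate=0.5)
```

With these toy rates each infected lineage causes two new infections on average before it is removed, and one third of removals are sequenced samples. In a Bayesian analysis these quantities are given priors and estimated from the sequence data rather than fixed.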


Citations: 0

Gene Set Enrichment Analysis in Zebrafish Embryos Is Susceptible to False-Positive Results in the Absence of Differentially Expressed Genes.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-04 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251321071

John Dh Stead, Hyojin Lee, Andrew Williams, Sergio A Cortés Ramírez, Ella Atlas, Jan A Mennigen, Jason M O'Brien, Carole Yauk

High-throughput gene expression studies commonly employ pathway analyses to infer biological meaning from lists of differentially expressed genes (DEGs). In toxicology and pharmacology studies, treatment groups are analysed against vehicle controls to identify DEGs and altered pathways. Previously, we empirically quantified false-positive rates of DEGs in gene expression data from pools of vehicle-treated zebrafish embryos to determine appropriate study designs (sample and pool size). Here, the same data were subjected to Over-Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA) to identify false-positive enriched pathways. As expected, the number of false-positive ORA results was lowest where pool and sample sizes were largest (conditions which also generated the fewest significant DEGs). In contrast, the frequency of GSEA false-positives generated through the fast GSEA (fgsea) algorithm increased with pool and sample size and was highest for simulations that generated 0 DEGs, with ribosomal gene sets significantly enriched with the highest frequency. We describe 2 distinct mechanisms by which GSEA generated these false-positive results, both of which are most likely to generate significant gene sets under conditions where expression differences are particularly low. Finally, GSEA analyses were repeated using 1 alternative GSEA algorithm (CERNO) and 11 different ranking statistics. In almost every analysis, the number of significant results was highest where pool size was highest, with ribosome as the most frequently enriched gene set, suggesting that our observations are generalizable to different implementations of GSEA. These results from zebrafish embryos suggest caution in interpreting any GSEA results in contrasts where there are no DEGs.
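The contrast between ORA and GSEA is easier to see with the ORA test written out. The Python sketch below uses the standard one-sided hypergeometric test that ORA is usually based on; the function name and the toy counts are illustrative and are not taken from the study.

```python
from math import comb


def ora_p_value(universe_size, set_size, n_degs, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    universe_size: number of genes tested
    set_size:      genes in the pathway / gene set
    n_degs:        differentially expressed genes in the universe
    overlap:       DEGs that fall inside the gene set
    """
    upper = min(set_size, n_degs)
    total = comb(universe_size, n_degs)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_degs - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts: 10 genes, a 5-gene set, 5 DEGs, all of them inside the set.
p_all_in_set = ora_p_value(universe_size=10, set_size=5, n_degs=5, overlap=5)

# With no DEGs there is no overlap to test and the p-value is 1.
p_no_degs = ora_p_value(universe_size=10, set_size=5, n_degs=0, overlap=0)
```

Because ORA depends on a DEG list, a contrast with 0 DEGs cannot produce a significant result under this test. GSEA instead ranks all genes by a statistic, so it can still report enriched sets when no gene passes a DEG threshold, which is the situation the abstract warns about.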


Citations: 0

Computational Development of Transmission-Blocking Vaccine Candidates Based on Fused Antigens of Pre- and Post-fertilization Gametocytes Against Plasmodium falciparum.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-03 eCollection Date: 2025-01-01 DOI: 10.1177/11779322241306215

Matthew A Adeleke

Plasmodium falciparum is the most fatal species of malaria parasite in humans. Attempts at developing vaccines against malaria parasites have not been very successful, even after the approval of the RTS,S/AS01 vaccine. There is a continuing need for more effective vaccines, including sexual-stage antigens that could block the transmission of malaria parasites between mosquitoes and humans. Low immunogenicity, expression, and stability are some of the challenges of transmission-blocking vaccines (TBVs). This study was designed to computationally identify TBV candidates based on fused antigens by combining highly antigenic peptides from prefertilization (Pfs230, Pfs48/45) and postfertilization (Pfs25, Pfs28) gametocytes. The peptides were selected based on their antigenicity, nonallergenicity, and lack of similarity with the human proteome. Two fused-antigen vaccine candidates (FAVCs) were constructed using Salmonella enterica flagellin (FAVC-FSE) and cholera toxin B (FAVC-CTB) as adjuvants. The constructs were evaluated for their physicochemical properties, structural stability, immunogenicity, and potential to elicit cross-protection across multiple Plasmodium species. The results yielded antigenic peptides with antigenicity scores between 0.7589 and 1.1821. The structural analysis of FAVC-FSE and FAVC-CTB showed Z-scores of -6.70 and -4.79, Ramachandran plot scores of 96.94% and 94.86%, and overall quality of 94.20% and 89.85%, respectively. The FAVCs contained CD8⁺, CD4⁺, and linear B-cell epitopes with antigenicity scores between 1.2089 and 2.8623, 0.5663 and 2.4132, and 1.5196 and 2.2212, respectively. Each FAVC generated 6 conformational B-cell epitopes. High population coverage values were recorded for the FAVCs. The ability of the FAVCs to trigger an immune response was evaluated through an in silico immune simulation. The low binding interaction energy that resulted from molecular docking and dynamics simulations showed a strong affinity of the FAVCs for Toll-like receptor 5 (TLR5). The results indicate that the FAVC-FSE vaccine candidate is the more promising of the two for interrupting P. falciparum transmission and provides a baseline for experimental validation.


Citations: 0

Bioinformatics-Driven Investigations of Signature Biomarkers for Triple-Negative Breast Cancer.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-03-02 eCollection Date: 2025-01-01 DOI: 10.1177/11779322241271565

Shristi Handa, Sanjeev Puri, Mary Chatterjee, Veena Puri

Breast cancer is a highly heterogeneous disorder characterized by dysregulated expression of a number of genes and their cascades. It is one of the most common types of cancer in women and poses serious health concerns globally. Recent developments in, and the discovery of, specific prognostic biomarkers have enabled their application toward developing personalized therapies. The basic premise of this study was to investigate key signature genes and signaling pathways involved in triple-negative breast cancer using a bioinformatics approach. Microarray data set GSE65194 from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus was used to identify differentially expressed genes (DEGs) using R software. Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were carried out using the ClueGO plugin in Cytoscape software. The up-regulated DEGs were primarily engaged in the regulation of the cell cycle, overexpression of the spindle assembly checkpoint, and so on, whereas down-regulated DEGs were involved in alterations to major signaling pathways and metabolic reprogramming. The hub genes were identified from the protein-protein interaction (PPI) networks of the top up-regulated and down-regulated DEGs using the cytoHubba plugin in Cytoscape software. The hub genes were validated as potential signature biomarkers by evaluating the overall survival percentage in breast cancer patients.
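
The hub-gene step described in this abstract can be illustrated with a minimal sketch. It ranks the nodes of a toy PPI network by degree, which is one of several ranking methods that cytoHubba offers; the abstract does not say which method the authors used, so degree is an assumption here. The function name `rank_hub_genes`, the edge list, and the gene names are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def rank_hub_genes(edges, top_n=3):
    """Rank genes by degree (number of distinct interaction partners)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, then by name so ties have a stable order
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_n]]


# Hypothetical toy network; the names are placeholders, not genes from the study
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]
hubs = rank_hub_genes(toy_edges, top_n=2)
print(hubs)
```

In practice the edge list would come from the PPI network built for the DEGs, and the up-regulated and down-regulated sets would be ranked separately, as the abstract describes.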



Citations: 0

A "Dock-Work" Orange: A Dual-Receptor Biochemical Theory on the Deterrence Induced by Citrusy Aroma on Elephant Traffic Central to a Conservation Effort.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-02-28 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251315922

Dilantha Gunawardana

Conservation of elephants requires physical, chemical, and biological approaches to ensure the protection of these gargantuan pachyderms. One such approach is using orange plants (as biofencing) to repel elephants, which precludes catastrophic events related to the encroachment of elephants into human habitats. Elephants use G-protein-coupled receptors (GPCRs) for sensitive olfactory discrimination of plant volatile compounds during foraging and other behavior. Two such receptors are the A2A and A2B receptors, which mediate olfaction elicited by a host of ligands, including limonene, the main volatile compound in citrus plants, which is hypothesized to be the chief repelling agent. Bioinformatics at the protein and mRNA levels (BLAST/multiple sequence alignments) was employed to explore the multiple expression products of A2B receptors, namely full-length and truncated proteins produced by isoform mRNAs translated from multiple methionines. A comparison of the limonene-binding pockets of human and elephant A2B receptors and prediction servers [NetPhos 3.1; Protter] were used to focus, respectively, on the contacts that limonene binding entails and the post-translational modifications involved in cell signaling. Finally, the link between limonene and antifeedant activity was explored by considering the limonene content of trees that elephants preferentially forage or avoid as part of their feeding behavior. The African bush elephant (Loxodonta africana) possesses a full-length A2A receptor but, unlike most mammals, expresses a highly truncated A2B receptor isoform possessing only transmembrane helices 5, 6, and 7. Truncation may lead to higher traffic and expression of the A2B receptor in olfactory interfaces/pathways and aid stronger activation. In addition, all residues in the putative limonene-binding cleft are perfectly conserved between the human and African bush elephant A2B receptors, both full length and truncated. Shallow activation sites require micromolar affinity and fewer side-chain interactions, which is speculated to be the case for the truncated A2B receptor. An N-glycosylation motif at the N-terminal extremity is indicative of membrane localization of the truncated A2B receptor following accurate folding. A combination of truncation, indels, substitutions, and transcript isoforms is credited with roles in the evolution of the L. africana A2B receptor, out of which limonene receptivity may be the key. It is also inferred how limonene may act as a dietary repellent/antifeedant to a generalist herbivore, with documented limonene content being absent in some dietary favorites, including the iconic Sclerocarya birrea.
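
The binding-pocket comparison reported in this abstract amounts to checking whether two aligned sequences carry the same residue at each pocket position. The sketch below shows that check under loud assumptions: the function name `pocket_mismatches`, the two aligned sequences, and the column indices are made up for illustration and are not the A2B receptor sequences or the pocket residues from the study.

```python
def pocket_mismatches(aligned_a, aligned_b, pocket_columns):
    """Return (column, residue_a, residue_b) for pocket columns that differ.

    Both inputs are rows of the same pairwise or multiple alignment, so they
    must have equal length; '-' marks a gap and is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    mismatches = []
    for col in pocket_columns:
        res_a, res_b = aligned_a[col], aligned_b[col]
        if res_a != res_b or res_a == "-":
            mismatches.append((col, res_a, res_b))
    return mismatches


# Hypothetical aligned fragments (0-based columns), not real receptor sequences
seq_a = "MKT-LVAGF"
seq_b = "MRT-LVSGF"

conserved_case = pocket_mismatches(seq_a, seq_b, [0, 4, 5, 8])
divergent_case = pocket_mismatches(seq_a, seq_b, [1, 6])
print(conserved_case, divergent_case)
```

An empty list means every listed pocket position is identical in the two sequences, which is the situation the abstract reports for the human and elephant receptors.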



Citations: 0

Predicting TF-Target Gene Association Using a Heterogeneous Network and Enhanced Negative Sampling.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-02-25 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251316130

Thanh Tuoi Le, Xuan Tho Dang

Identifying interactions between transcription factors (TFs) and target genes is crucial for understanding the molecular mechanisms involved in biological processes and diseases. Traditional biological experiments used to determine these interactions are often time-consuming, costly, and limited in scale. Current computational methods mainly predict binding sites rather than direct interactions. Although recent studies have achieved high performance in predicting TF-target gene associations, they still face a significant challenge in constructing a robust dataset of positive and negative samples. Existing methods do not adequately address the selection of negative samples, resulting in incomplete coverage of potential TF-target gene relationships. This article proposes a method for selecting enhanced negative samples to improve the prediction performance of TF-target gene interactions. Experimental results show that the proposed method achieves an average area under the curve (AUC) value of 0.9024 ± 0.0008 through 5-fold cross-validation. These results demonstrate the model's high efficiency and accuracy, confirming its potential application in predicting TF-target gene interactions across various datasets and paving the way for large-scale biomedical research.
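
A figure such as "AUC of 0.9024 ± 0.0008 through 5-fold cross-validation" is typically a mean and a spread over the per-fold AUC values. The sketch below shows that computation in plain Python. It assumes the ± term is a sample standard deviation, which the abstract does not state; the function names and the toy labels and scores are hypothetical and are not the study's data or model.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_results):
    """Return (mean AUC, sample standard deviation) over the folds."""
    aucs = [roc_auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), stdev(aucs)


# Hypothetical held-out predictions for 5 folds: (true labels, predicted scores)
toy_folds = [
    ([1, 1, 0, 0], [0.90, 0.80, 0.30, 0.10]),
    ([1, 1, 0, 0], [0.90, 0.40, 0.60, 0.10]),
    ([1, 1, 0, 0], [0.70, 0.60, 0.20, 0.10]),
    ([1, 1, 0, 0], [0.80, 0.30, 0.50, 0.20]),
    ([1, 1, 0, 0], [0.95, 0.85, 0.40, 0.30]),
]
mean_auc, sd_auc = summarize_folds(toy_folds)
print(f"AUC = {mean_auc:.4f} ± {sd_auc:.4f}")
```

The label 0 entries stand in for the negative samples; how those negatives are chosen is the contribution of the article and is not reproduced here.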



Citations: 0

Phenazine Scaffolds as a Potential Allosteric Inhibitor of LasR Protein in Pseudomonas aeruginosa.

IF 2.3 Q3 BIOCHEMICAL RESEARCH METHODS

Bioinformatics and Biology Insights

Pub Date : 2025-02-20 eCollection Date: 2025-01-01 DOI: 10.1177/11779322251319594

Prisca Baah Nketia, Prince Manu, Priscilla Osei-Poku, Alexander Kwarteng

Millions of individuals suffer from chronic infections caused by bacterial biofilms, resulting in significant loss of life. Pseudomonas aeruginosa stands out as a major culprit in causing such chronic infections, largely due to its antibiotic resistance. This pathogen poses a considerable threat in healthcare settings, particularly to critically ill and immunocompromised patients. The persistence of chronic and recurrent bacterial infections is often attributed to bacterial biofilms. Therefore, there is an urgent need to discover novel small molecules capable of efficiently eliminating biofilms independent of bacterial growth. In this project, an in silico drug discovery approach was employed to identify nine halogenated phenazine compounds as allosteric inhibitors of the LasR protein. LasR is a key transcription factor that triggers other quorum-sensing systems and plays a crucial role in biofilm formation and activation of virulence genes. Inhibiting LasR by specifically targeting its allosteric site could prevent the dimerization of LasR and subsequent biofilm formation. Molecular docking and simulations, coupled with binding energy calculations, identified five compounds with potential as anti-biofilm agents. These compounds exhibited higher binding affinities to the distal site, suggesting their structural capability to interact with allosteric site residues of the LasR protein. Based on these findings, it is proposed that these compounds could serve as promising leads for the treatment of biofilm and quorum-sensing-related infections.



Citations: 0