Recombination plays a key role in the evolution of RNA viruses and the emergence of potentially epidemic variants. Some studies have investigated the occurrence of recombination in SARS-CoV-2 without exploring its impact on virus-host interaction. To assess the frequency and distribution of recombination, its occurrence was first explored in 44 230 Omicron sequences, covering BQ subvariants and the sequences designated "ML" (Multiple Lineages) that are still under investigation, using the 3seq software. Second, the impact of recombination on the interaction of the spike protein with the ACE2 receptor and with neutralizing antibodies (nAbs) was analyzed using docking tools. Recombination was detected in 56.91% of BQ strains and 82.20% of ML strains, and it took place mainly in the spike and ORF1a genes. For BQ recombinant strains, the docking analysis showed that the spike interacted strongly with ACE2 and weakly with nAbs. The mutations S373P, S375F and T376A form a residue network that enhances the RBD interaction with ACE2. Thirteen mutations, twelve in the RBD (S373P, S375F, T376A, D405N, R408S, K417N, N440K, S477N, P494S, Q498R, N501Y, and Y505H) and one in the NTD (Y240H), appear to be implicated in the immune evasion of recombinants by altering the spike interaction with nAbs. In conclusion, this in silico study showed that recombination is frequent among Omicron BQ and ML variants. It highlights new key mutations that are potentially implicated in enhanced spike binding to ACE2 (F376A) and in escape from nAbs (RBD: F376A, D405N, R408S, N440K, S477N, P494S, and Y505H; NTD: Y240H). These findings offer insights for the development of effective prophylactic and therapeutic strategies against future SARS-CoV-2 waves.
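The substitution labels in the abstract follow the usual reference-residue, position, variant-residue pattern, so they can be split and grouped by spike domain programmatically. The sketch below is only an illustration: the residue ranges for the NTD and RBD are approximate values commonly cited for the SARS-CoV-2 spike and are an assumption here, not figures taken from the study, and the function names are hypothetical.

```python
import re

# Approximate spike domain boundaries (assumed for illustration;
# published ranges differ by a few residues).
DOMAINS = {"NTD": (14, 305), "RBD": (319, 541)}

# One-letter reference residue, position, one-letter variant residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# The thirteen substitutions listed in the abstract.
MUTATIONS = ["S373P", "S375F", "T376A", "D405N", "R408S", "K417N", "N440K",
             "S477N", "P494S", "Q498R", "N501Y", "Y505H", "Y240H"]


def parse_mutation(label):
    """Split a substitution such as 'S373P' into (reference, position, variant)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def domain_of(position):
    """Return the name of the domain containing the position, or None."""
    for name, (start, end) in DOMAINS.items():
        if start <= position <= end:
            return name
    return None


def group_by_domain(labels):
    """Map each domain name to the substitutions that fall inside it."""
    grouped = {}
    for label in labels:
        _, pos, _ = parse_mutation(label)
        grouped.setdefault(domain_of(pos), []).append(label)
    return grouped


grouped = group_by_domain(MUTATIONS)
for domain, labels in grouped.items():
    print(domain, len(labels), ", ".join(labels))
```

Under these assumed boundaries the grouping reproduces the split given in the abstract: twelve substitutions in the RBD and one in the NTD.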
Title: Recombination Events Among SARS-CoV-2 Omicron Subvariants: Impact on Spike Interaction With ACE2 Receptor and Neutralizing Antibodies.
Authors: Marwa Arbi, Marwa Khedhiri, Kaouther Ayouni, Oussema Souiai, Samar Dhouib, Nidhal Ghanmi, Alia Benkahla, Henda Triki, Sondes Haddad-Boubaker
Journal: Evolutionary Bioinformatics, volume 20, article 11769343241272415. Published 2024-08-14. DOI: 10.1177/11769343241272415. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11325312/pdf/
Pub Date: 2024-08-14; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241272413
Yili Luo, Jianpeng Liu, Wangqiang Feng, Da Lin, Mengji Chen, Haihua Zheng
Background: Age-related Macular Degeneration (AMD) poses a growing global health concern as the leading cause of central vision loss in elderly people.
Objective: This study examines the involvement of Natural Killer (NK) cells in AMD, focusing on their immune responses and cytokine regulatory roles.
Methods: Single-cell RNA-seq transcriptomic data from the Gene Expression Omnibus database were analyzed. High-dimensional weighted gene co-expression network analysis (hdWGCNA) and single-cell regulatory network inference and clustering (SCENIC) analysis were applied to reveal the regulatory mechanisms of NK cells in early-stage AMD patients. Machine learning models, such as random forests and decision trees, were used to screen hub genes and key transcription factors (TFs) associated with AMD.
Results: Distinct cell clusters were identified, notably the T/NK cluster, and NK cell abundance was markedly increased in AMD. Cell-cell communication analyses revealed altered interactions, particularly in NK cells, indicating their potential role in AMD pathogenesis. hdWGCNA highlighted the turquoise module, enriched in inflammation-related pathways, as significantly associated with AMD in NK cells. The SCENIC analysis identified key TFs in NK cell regulatory networks. Integrating hub genes and TFs through machine learning identified CREM, FOXP1, IRF1, NFKB2, and USF2 as potential predictors of AMD.
Conclusion: This comprehensive approach enhances our understanding of NK cell dynamics, signaling alterations, and potential predictive models for AMD. The identified TFs provide new avenues for molecular interventions and highlight the intricate relationship between NK cells and AMD pathogenesis. Overall, this study contributes valuable insights for advancing our understanding and management of AMD.
Title: Single-cell RNA Sequencing Identifies Natural Kill Cell-Related Transcription Factors Associated With Age-Related Macular Degeneration.
Journal: Evolutionary Bioinformatics, volume 20, article 11769343241272413. Published 2024-08-14. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11325330/pdf/
Pub Date: 2024-07-24; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241263230
Arthur Casulli de Oliveira, Luiz Augusto Bovolenta, Lucas Figueiredo, Amanda De Oliveira Ribeiro, Beatriz Jacinto Alves Pereira, Talita Roberto Aleixo de Almeida, Vinicius Farias Campos, James G Patton, Danillo Pinhal
In metazoans, microRNAs (miRNAs) are essential regulators of gene expression, affecting critical cellular processes from differentiation and proliferation to homeostasis. During miRNA biogenesis, the miRNA strand that is loaded onto the RNA-induced silencing complex (RISC) can vary, leading to changes in gene targeting and modulation of biological pathways. To investigate the impact of these "arm switching" events on gene regulation, we analyzed a diverse range of tissues and developmental stages in zebrafish, comparing the accumulation dynamics of the 5p and 3p arms across embryonic developmental stages, adult tissues, and sexes. We also compared the variable arm usage patterns observed in zebrafish with those of other vertebrates, including arm switching data from fish, birds, and mammals. Our analysis revealed that variable arm usage events occur predominantly during embryonic development. Notably, isomiR occurrence correlates with changes in arm selection, indicating an important role for distinct miRNA isoforms in reinforcing and modifying gene regulation by promoting dynamic switches in the accumulation of miRNA 5p and 3p arms. Our results shed new light on the emergence and coordination of gene expression regulation and pave the way for future investigations in this field.
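One simple way to make the idea of arm switching concrete is to compare, for a single miRNA, which arm dominates the read counts in two conditions. The sketch below is not the authors' pipeline: the 5p fraction, the `margin` threshold, and the function names are assumptions chosen for illustration.

```python
def five_prime_fraction(count_5p, count_3p):
    """Fraction of reads for one miRNA that come from the 5p arm."""
    total = count_5p + count_3p
    if total == 0:
        return None  # miRNA not detected in this sample
    return count_5p / total


def dominant_arm(count_5p, count_3p, margin=0.1):
    """Label the dominant arm; 'balanced' if neither arm leads by the margin.

    The margin (an assumed, tunable threshold) keeps near 50:50 ratios
    from being called as a dominant arm.
    """
    fraction = five_prime_fraction(count_5p, count_3p)
    if fraction is None:
        return None
    if fraction >= 0.5 + margin:
        return "5p"
    if fraction <= 0.5 - margin:
        return "3p"
    return "balanced"


def is_arm_switch(sample_a, sample_b, margin=0.1):
    """True when the dominant arm is 5p in one sample and 3p in the other.

    Each sample is a (count_5p, count_3p) pair for the same miRNA.
    """
    arm_a = dominant_arm(*sample_a, margin=margin)
    arm_b = dominant_arm(*sample_b, margin=margin)
    return {arm_a, arm_b} == {"5p", "3p"}


# Hypothetical read counts for one miRNA at two developmental stages.
embryo = (900, 100)
adult = (100, 900)
print(dominant_arm(*embryo), dominant_arm(*adult), is_arm_switch(embryo, adult))
```

A real analysis would normalize counts across libraries and require replicate support before calling a switch; the sketch leaves both out to keep the comparison itself visible.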
Title: MicroRNA Transcriptomes Reveal Prevalence of Rare and Species-Specific Arm Switching Events During Zebrafish Ontogenesis.
Journal: Evolutionary Bioinformatics, volume 20, article 11769343241263230. Published 2024-07-24. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11271096/pdf/
Pub Date: 2024-06-14; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241261814
Linrong Wan, Siyuan Su, Jinyun Liu, Bangxing Zou, Yaming Jiang, Beibei Jiao, Shaokuan Tang, Youhong Zhang, Cao Deng, Wenfu Xiao
Background: Pseudogenes are sequences that have lost the ability to transcribe RNA molecules or encode truncated but possibly functional proteins. Although they were once considered meaningless remnants of evolution, recent research has shown that pseudogenes play important roles in various biological processes. However, studies of pseudogenes in the silkworm, an important model organism, are limited and have focused on single genes or only a few specific genes.
Objective: To fill these gaps, we present a systematic genome-wide study of pseudogenes in the silkworm.
Methods: We identified pseudogenes in the silkworm using the silkworm genome assemblies, the transcriptome, and protein sequences from the silkworm and its related species. We then used transcriptome datasets from 832 RNA-seq analyses to construct spatio-temporal expression profiles for these pseudogenes. Additionally, we identified tissue-specifically expressed and differentially expressed pseudogenes to further understand their characteristics. Finally, the functional roles of pseudogenes as lncRNAs were systematically analyzed.
Results: We identified a total of 4410 pseudogenes, which were classified into 4 groups: duplications (DUPs), unitary pseudogenes (Unitary), processed pseudogenes (retropseudogenes, RETs), and fragments (FRAGs). Most pseudogenes in the domestic silkworm were generated before the divergence of the wild and domestic silkworm; however, domestication may also have contributed to the accumulation of pseudogenes. The pseudogenes fell into 2 clear clusters, one highly expressed and one lowly expressed, and the posterior silk gland was the tissue with the most tissue-specific pseudogenes (199), implying that these pseudogenes may be involved in the development and function of the silk gland. We identified 3299 lncRNAs in these pseudogenes, and the target genes of these lncRNAs were enriched in egg formation and olfactory function.
Conclusions: This study supplements the genome annotation of the silkworm and provides valuable insights into the biological roles of pseudogenes. It will also contribute to our understanding of the complex gene regulatory networks in the silkworm and may have implications for other organisms as well.
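The abstract does not say how tissue specificity was scored. One widely used measure is the tau index, which is 0 for uniform expression across tissues and 1 for expression confined to a single tissue. The sketch below illustrates that measure as an assumption, not as the study's method; the expression values are invented.

```python
def tau_index(expression):
    """Tau tissue-specificity index for one gene.

    expression: non-negative expression values, one per tissue.
    Returns a value in [0, 1], or None if the gene is not expressed anywhere.
    """
    values = list(expression)
    if len(values) < 2:
        raise ValueError("tau needs at least two tissues")
    peak = max(values)
    if peak <= 0:
        return None
    # Sum of each tissue's shortfall relative to the peak tissue,
    # scaled by the number of non-peak tissues.
    return sum(1 - value / peak for value in values) / (len(values) - 1)


# Hypothetical expression values across four tissues.
print(tau_index([10, 0, 0, 0]))  # confined to one tissue
print(tau_index([5, 5, 5, 5]))   # uniform across tissues
```

Tau is usually computed on normalized, often log-transformed, expression values, and the cutoff above which a gene counts as tissue-specific is a choice left to the analyst.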
Title: The Spatio-Temporal Expression Profiles of Silkworm Pseudogenes Provide Valuable Insights into Their Biological Roles.
Journal: Evolutionary Bioinformatics, volume 20, article 11769343241261814. Published 2024-06-14. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11179516/pdf/
Pub Date: 2024-05-30; DOI: 10.1177/11769343241257344
Chao He, Bin Zhu, Wenwen Gao, Qianjin Wu, Changshui Zhang
In diploid organisms, half of the chromosomes in each cell come from the father and half from the mother. Previous studies have found that the paternal and maternal chromosomes can be regulated and expressed independently, giving rise to allele-specific expression (ASE). In this study, we analyzed the differential expression of alleles between a high-altitude population and a normal population using RNA sequencing data. Through gene cluster analysis and protein interaction network analysis, we found that some changes occurred at the gene level, along with some negative effects. During the study, we found that the calmodulin homology domain may be correlated with long-term survival at high altitude. The plateau environment is characterized by hypoxia, low air pressure, strong ultraviolet radiation, and low temperature; accordingly, the genetic changes that arise during adaptation mainly reflect these characteristics. Living at high altitude for generations is also strongly associated with cancer, immune disease, cardiovascular disease, neurological disease, endocrine disease, and other diseases. Therefore, medical systems in high-altitude areas should pay more attention to these diseases.
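ASE at a heterozygous site is often assessed by asking whether the reads carrying each allele depart from the 1:1 ratio expected under balanced expression. The abstract does not state which test was used, so the exact binomial test below is an assumed illustration; the function name and the read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value against a 50:50 allelic ratio.

    Sums the probability of every outcome that is no more likely than
    the observed split of reads between the two alleles.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    # With p = 0.5, every outcome has probability C(total, k) / 2**total,
    # so comparing binomial coefficients is enough and stays exact.
    weights = [comb(total, k) for k in range(total + 1)]
    observed = weights[ref_reads]
    extreme = sum(weight for weight in weights if weight <= observed)
    return extreme / 2 ** total


# Hypothetical read counts at two heterozygous sites.
print(binomial_two_sided_p(5, 5))  # balanced expression
print(binomial_two_sided_p(9, 1))  # skewed toward the reference allele
```

In practice a null ratio of exactly 0.5 tends to be optimistic because of reference mapping bias, and p-values across many sites need multiple-testing correction; both are outside this sketch.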
Title: Study on Allele Specific Expression of Long-Term Residents in High Altitude Areas
Journal: Evolutionary Bioinformatics. Published 2024-05-30. URL: https://doi.org/10.1177/11769343241257344
Pub Date: 2024-05-10; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241249916
Syed Shah Muhammad, Muhammad Shoaib, Muhammad Tariq Pervez
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation in the human genome. Analyzing genetic variants can help us better understand the genetic basis of diseases and develop predictive models that identify individuals at increased risk for certain diseases. Several SNP analysis tools have already been developed, but to run them the user must first collect data from various databases. In addition, researchers often have to use multiple variant analysis tools to cross-validate their results and increase confidence in their findings. Extracting data from multiple databases and running multiple tools at a time increases the complexity of the analysis and the time it requires. Some web-based tools integrate multiple genetic variant databases and provide variant annotations for a few tools, but these approaches have limitations, for example in retrieving annotation information and filtering common pathogenic variants. The proposed web-based tool, IPSNP (An Integrated Platform for Predicting Impact of SNPs), is written in Django, a Python-based framework. It uses the RESTful API of MyVariant.info to extract annotation information for 29 tools for variants associated with a given gene, rsID, or HGVS-format variants specified in a VCF file. The results consist of (1) a CSV file of predictions derived from the consensus decision, (2) a file of annotations for the variants associated with the given gene, (3) a file of variants declared pathogenic in common by the selected tools, and (4) a CSV file containing chromosome coordinates based on the GRCh37 and GRCh38 genome assemblies, rsIDs, and proteomic data, so that users can run tools of their choice without collecting parameters manually for each tool. IPSNP is a valuable resource for researchers and clinicians; it can save time and effort in discovering novel disease-associated variants and developing personalized treatments.
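The abstract mentions a consensus decision and a list of variants that the selected tools agree are pathogenic, without describing the rule behind either. The sketch below shows one plausible reading, a majority vote for the consensus and unanimity for the common list. Both rules, the tool names, and the variant identifiers are assumptions made for illustration; this is not IPSNP's code.

```python
def consensus_call(predictions):
    """Majority vote over per-tool predictions for one variant.

    predictions maps a tool name to True (pathogenic), False (benign),
    or None (the tool returned no prediction).
    """
    votes = [vote for vote in predictions.values() if vote is not None]
    if not votes:
        return "no_call"
    pathogenic = sum(votes)
    benign = len(votes) - pathogenic
    if pathogenic > benign:
        return "pathogenic"
    if benign > pathogenic:
        return "benign"
    return "tie"


def commonly_pathogenic(variant_predictions, selected_tools):
    """Variants that every selected tool calls pathogenic."""
    if not selected_tools:
        return []
    return [
        variant
        for variant, predictions in variant_predictions.items()
        if all(predictions.get(tool) is True for tool in selected_tools)
    ]


# Hypothetical predictions from three placeholder tools.
example = {
    "variant_1": {"tool_a": True, "tool_b": True, "tool_c": True},
    "variant_2": {"tool_a": True, "tool_b": False, "tool_c": None},
    "variant_3": {"tool_a": False, "tool_b": False, "tool_c": True},
}
for variant, predictions in example.items():
    print(variant, consensus_call(predictions))
print(commonly_pathogenic(example, ["tool_a", "tool_b", "tool_c"]))
```

Treating a missing prediction as an abstention rather than a benign vote keeps sparsely annotated variants from being pushed toward a benign call.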
Title: An Integrated Framework for Analysis and Prediction of Impact of Single Nucleotide Polymorphism Associated with Human Diseases.
Journal: Evolutionary Bioinformatics, volume 20, article 11769343241249916. Published 2024-05-10. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11088291/pdf/
Pub Date: 2024-04-26; DOI: 10.1177/11769343241249017
Yihang Zhao, Hong Tang, Jianhua Xu, Feifei Sun, Yuanyuan Zhao, Yang Li
Background: Intestinal metaplasia (IM) of the gastric epithelium has traditionally been regarded as an irreversible stage in the Correa cascade. Exploring the potential molecular mechanism of IM is significant for effective gastric cancer prevention.
Methods: The GSE78523 dataset, obtained from the Gene Expression Omnibus (GEO) database, was analyzed using RStudio software to identify the differentially expressed genes (DEGs) between IM tissues and normal gastric epithelial tissues. Subsequently, gene ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, Gene Set Enrichment Analysis (GSEA), and protein-protein interaction (PPI) analysis were used to find potential genes. Additionally, the screened genes were analyzed for clinical, immunological, and genetic correlations using single gene clinical correlation analysis (UALCAN) and the Tumor–Immune System Interactions Database (TISIDB), and validated through western blot experiments.
Results: Enrichment analysis showed that the lipid metabolic pathway was significantly associated with IM tissues, and the apolipoprotein B (APOB) gene was identified in the subsequent analysis. Experimental results and correlation analysis showed that the expression of APOB was higher in IM tissues than in normal tissues. This elevated expression of APOB was also associated with the expression level of the hepatocyte nuclear factor 4A (HNF4A) gene. HNF4A was associated with immune cell infiltration in gastric cancer and was linked to the prognosis of gastric cancer patients. Moreover, HNF4A was highly expressed in both IM tissues and gastric cancer cells.
Conclusion: Our findings indicate that HNF4A regulates the microenvironment of lipid metabolism in IM tissues by targeting APOB. Higher expression of HNF4A tends to lead to a worse prognosis in gastric cancer patients, implying that it may serve as a predictive indicator for the progression from IM to gastric cancer.
Title: HNF4A-Bridging the Gap Between Intestinal Metaplasia and Gastric Cancer
Journal: Evolutionary Bioinformatics. Published 2024-04-26. URL: https://doi.org/10.1177/11769343241249017
Pub Date: 2024-04-05 DOI: 10.1177/11769343241240558
Ahmed Kabir Refaya, Umashankar Vetrivel, Kannan Palaniyandi
Mycobacterium orygis, a subspecies of the Mycobacterium tuberculosis complex (MTBC), has emerged as a significant concern in the context of One Health, with implications for zoonosis, zooanthroponosis, or both. MTBC strains are characterized by the unique insertion element IS6110, which is widely used as a diagnostic marker. IS6110 transposition drives genetic modifications in MTBC, imparting genome plasticity with profound biological consequences. While IS6110 insertions are customarily found in MTBC genomes, the evolutionary trajectory of strains seems to correlate with the number of IS6110 copies, indicating enhanced adaptability with increasing copy numbers. Here, we present a comprehensive analysis of IS6110 insertions in the M. orygis genome, utilizing ISMapper, and elucidate their genetic consequences in promoting successful host adaptation. Our study encompasses a panel of 67 paired-end read datasets, comprising 11 isolates from our laboratory and 56 sequences downloaded from public databases. Among these sequences, 91% carried high-copy IS6110 insertions, 4.5% carried low-copy insertions, and 4.5% lacked IS6110 insertions. We identified 255 insertion loci, including 141 intragenic and 114 intergenic insertions. Most of these loci were either unique or shared among a limited number of isolates, potentially influencing strain behavior. Furthermore, we conducted gene ontology and pathway analysis, using eggNOG-mapper 5.0, on the protein sequences disrupted by IS6110 insertions, revealing 63 genes involved in diverse Gene Ontology functions and 45 genes participating in various KEGG pathways. Our findings offer novel insights into IS6110 insertions, their preferential insertion regions, and their impact on metabolic processes and pathways, providing valuable knowledge on the genetic changes underpinning IS6110 transposition in M. orygis.
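The split of insertion loci into intragenic and intergenic can be illustrated with a minimal sketch. This is not ISMapper's algorithm and not the authors' pipeline; the function name `classify_insertions`, the coordinate convention (1-based, inclusive), and the assumption that annotated genes do not overlap are all hypothetical choices made for illustration.

```python
from bisect import bisect_right


def classify_insertions(genes, positions):
    """Map each insertion position to the gene containing it, or None.

    genes: list of (start, end, name) tuples, 1-based inclusive coordinates,
           assumed non-overlapping (an illustrative simplification).
    positions: iterable of insertion coordinates on the same reference.
    """
    genes = sorted(genes)
    starts = [g[0] for g in genes]
    calls = {}
    for pos in positions:
        # Index of the last gene whose start is <= pos, or -1 if none.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and genes[i][0] <= pos <= genes[i][1]:
            calls[pos] = genes[i][2]   # intragenic: falls inside this gene
        else:
            calls[pos] = None          # intergenic: between annotated genes
    return calls


# Hypothetical toy annotation, not real M. orygis coordinates.
toy_genes = [(100, 200, "geneA"), (300, 400, "geneB")]
calls = classify_insertions(toy_genes, [150, 250, 400, 50])
intragenic = sum(1 for g in calls.values() if g is not None)
intergenic = sum(1 for g in calls.values() if g is None)
```

Sorting the gene starts once and using binary search keeps each lookup logarithmic in the number of genes, which matters little for 255 loci but scales to larger panels.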
{"title":"Genomic Characterization of IS6110 Insertions in Mycobacterium orygis","authors":"Ahmed Kabir Refaya, Umashankar Vetrivel, Kannan Palaniyandi","doi":"10.1177/11769343241240558","DOIUrl":"https://doi.org/10.1177/11769343241240558","url":null,"abstract":"Mycobacterium orygis, a subspecies of the Mycobacterium tuberculosis complex (MTBC), has emerged as a significant concern in the context of One Health, with implications for zoonosis or zooanthroponosis or both. MTBC strains are characterized by the unique insertion element IS 6110, which is widely used as a diagnostic marker. IS 6110 transposition drives genetic modifications in MTBC, imparting genome plasticity and profound biological consequences. While IS 6110 insertions are customarily found in the MTBC genomes, the evolutionary trajectory of strains seems to correlate with the number of IS 6110 copies, indicating enhanced adaptability with increasing copy numbers. Here, we present a comprehensive analysis of IS 6110 insertions in the M. orygis genome, utilizing ISMapper, and elucidate their genetic consequences in promoting successful host adaptation. Our study encompasses a panel of 67 paired-end reads, comprising 11 isolates from our laboratory and 56 sequences downloaded from public databases. Among these sequences, 91% exhibited high-copy, 4.5% low-copy, and 4.5% lacked IS 6110 insertions. We identified 255 insertion loci, including 141 intragenic and 114 intergenic insertions. Most of these loci were either unique or shared among a limited number of isolates, potentially influencing strain behavior. Furthermore, we conducted gene ontology and pathway analysis, using eggNOG-mapper 5.0, on the protein sequences disrupted by IS 6110 insertions, revealing 63 genes involved in diverse functions of Gene Ontology and 45 genes participating in various KEGG pathways. 
Our findings offer novel insights into IS 6110 insertions, their preferential insertion regions, and their impact on metabolic processes and pathways, providing valuable knowledge on the genetic changes underpinning IS 6110 transposition in M. orygis.","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"5 1","pages":""},"PeriodicalIF":2.6,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140579546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-25 eCollection Date: 2024-01-01 DOI: 10.1177/11769343241239463
Eshan Bundhoo, Anisah W Ghoorah, Yasmina Jaufeerally-Fakim
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), an infectious disease that is a major killer worldwide. Due to selection pressure caused by the use of antibacterial drugs, Mtb is characterised by mutational events that have given rise to multidrug-resistant (MDR) and extensively drug-resistant (XDR) phenotypes. The rate at which mutations occur is an important factor in the study of molecular evolution, and it helps in understanding gene evolution. Within the same species, different protein-coding genes evolve at different rates. To estimate the rates of molecular evolution of protein-coding genes, a commonly used parameter is the ratio dN/dS, where dN is the rate of non-synonymous substitutions and dS is the rate of synonymous substitutions. Here, we determined the estimated rates of molecular evolution of select biological processes and molecular functions across 264 strains of Mtb. We also investigated the molecular evolutionary rates of core genes of Mtb by computing the dN/dS values, and estimated the pan-genome of the 264 strains of Mtb. Our results show that the cellular amino acid metabolic process and the kinase activity function evolve at a significantly higher rate, while the carbohydrate metabolic process evolves at a significantly lower rate in M. tuberculosis. These high rates of evolution correlate well with Mtb physiology and pathogenicity. We further propose that the core genome of M. tuberculosis likely experiences varying rates of molecular evolution, which may drive an interplay between the core genome and the accessory genome during M. tuberculosis evolution.
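As a rough illustration of the dN/dS ratio defined above, the sketch below computes it from pre-counted substitutions and sites. The abstract does not state which estimation method was used, so this is only an assumption-laden example: it applies the Jukes-Cantor correction to the observed proportions (as in the Nei-Gojobori approach), and the function names and input counts are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Correct an observed proportion of differences for multiple hits."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """Return dN/dS from counts of differences and sites of each class."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)  # non-synonymous rate
    ds = jukes_cantor(syn_diffs / syn_sites)        # synonymous rate
    if ds == 0.0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn / ds


# Hypothetical counts for one gene pair, chosen only for illustration.
ratio = dn_ds(nonsyn_diffs=5, nonsyn_sites=700, syn_diffs=10, syn_sites=300)
```

Under the usual reading, a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection.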
{"title":"Large-scale Pan Genomic Analysis of <i>Mycobacterium tuberculosis</i> Reveals Key Insights Into Molecular Evolutionary Rate of Specific Processes and Functions.","authors":"Eshan Bundhoo, Anisah W Ghoorah, Yasmina Jaufeerally-Fakim","doi":"10.1177/11769343241239463","DOIUrl":"10.1177/11769343241239463","url":null,"abstract":"<p><p><i>Mycobacterium tuberculosis</i> (Mtb) is the causative agent of tuberculosis (TB), an infectious disease that is a major killer worldwide. Due to selection pressure caused by the use of antibacterial drugs, Mtb is characterised by mutational events that have given rise to multi drug resistant (MDR) and extensively drug resistant (XDR) phenotypes. The rate at which mutations occur is an important factor in the study of molecular evolution, and it helps understand gene evolution. Within the same species, different protein-coding genes evolve at different rates. To estimate the rates of molecular evolution of protein-coding genes, a commonly used parameter is the ratio <i>d</i>N/<i>d</i>S, where <i>d</i>N is the rate of non-synonymous substitutions and <i>d</i>S is the rate of synonymous substitutions. Here, we determined the estimated rates of molecular evolution of select biological processes and molecular functions across 264 strains of Mtb. We also investigated the molecular evolutionary rates of core genes of Mtb by computing the <i>d</i>N/<i>d</i>S values, and estimated the pan genome of the 264 strains of Mtb. Our results show that the cellular amino acid metabolic process and the kinase activity function evolve at a significantly higher rate, while the carbohydrate metabolic process evolves at a significantly lower rate for <i>M. tuberculosi</i>s. These high rates of evolution correlate well with Mtb physiology and pathogenicity. We further propose that the core genome of <i>M. 
tuberculosis</i> likely experiences varying rates of molecular evolution which may drive an interplay between core genome and accessory genome during <i>M. tuberculosis</i> evolution.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"20 ","pages":"11769343241239463"},"PeriodicalIF":1.7,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10964447/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140295209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pantoea sp. strain MHSD4 is a bacterial endophyte isolated from the leaves of the medicinal plant Pellaea calomelanos. Here, we report the draft whole-genome sequence and annotation of strain MHSD4. The draft genome size of Pantoea sp. strain MHSD4 is 4 647 677 bp with a G+C content of 54.2% and 41 contigs. The National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline tool predicted a total of 4395 genes inclusive of 4235 protein-coding genes, 87 total RNA genes, 14 non-coding (nc) RNAs and 70 tRNAs, and 73 pseudogenes. Biosynthesis pathways for naphthalene and anthracene degradation were identified. Putative genes involved in bioremediation such as copA, copD, cueO, cueR, glnGm, and trxC were identified. Putative genes involved in copper homeostasis and tolerance were also identified, which may suggest that Pantoea sp. strain MHSD4 has biotechnological potential for bioremediation of heavy metals.
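Summary statistics such as total assembly size and G+C content can be derived directly from the contig sequences. The following is a minimal sketch under stated assumptions: the function name `assembly_stats` is hypothetical, ambiguous bases such as N are counted toward length but excluded from the G+C denominator, and the toy contigs are invented rather than taken from the MHSD4 assembly.

```python
def assembly_stats(contigs):
    """Return (total length in bp, G+C percentage) pooled over all contigs.

    contigs: iterable of nucleotide strings (case-insensitive).
    Bases other than A, C, G, T add to the length but not to the
    G+C denominator (an illustrative convention).
    """
    total_len = 0
    gc = 0
    acgt = 0
    for seq in contigs:
        seq = seq.upper()
        total_len += len(seq)
        g_and_c = seq.count("G") + seq.count("C")
        gc += g_and_c
        acgt += g_and_c + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("no unambiguous bases found")
    return total_len, 100.0 * gc / acgt


# Invented toy contigs, used only to show the call.
size_bp, gc_percent = assembly_stats(["GGCCAATT", "gcNNat"])
```

Pooling counts across contigs, rather than averaging per-contig percentages, weights each contig by its length, which is the usual convention for a genome-wide G+C value.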
{"title":"Draft Genome Sequence of <i>Pantoea sp.</i> Strain MHSD4, a Bacterial Endophyte With Bioremediation Potential.","authors":"Dimpho Michelle Morobane, Khuthadzo Tshishonga, Mahloro Hope Serepa-Dlamini","doi":"10.1177/11769343231217908","DOIUrl":"10.1177/11769343231217908","url":null,"abstract":"<p><p><i>Pantoea</i> sp. strain MHSD4 is a bacterial endophyte isolated from the leaves of the medicinal plant <i>Pellaea calomelanos.</i> Here, we report on strain MHSD4 draft whole genome sequence and annotation. The draft genome size of <i>Pantoea</i> sp. strain MHSD4 is 4 647 677 bp with a G+C content of 54.2% and 41 contigs. The National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline tool predicted a total of 4395 genes inclusive of 4235 protein-coding genes, 87 total RNA genes, 14 non-coding (nc) RNAs and 70 tRNAs, and 73 pseudogenes. Biosynthesis pathways for naphthalene and anthracene degradation were identified. Putative genes involved in bioremediation such as <i>copA, copD, cueO, cueR, glnGm</i>, and <i>trxC</i> were identified. Putative genes involved in copper homeostasis and tolerance were identified which may suggest that <i>Pantoea</i> sp. strain MHSD4 has biotechnological potential for bioremediation of heavy metals.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"20 ","pages":"11769343231217908"},"PeriodicalIF":2.6,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10938601/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140133135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}