Pub Date: 2024-05-10; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241249916
Syed Shah Muhammad, Muhammad Shoaib, Muhammad Tariq Pervez
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation in the human genome. Analyzing genetic variants can help us better understand the genetic basis of diseases and develop predictive models that identify individuals at increased risk for certain diseases. Several SNP analysis tools have already been developed. To run these tools, the user needs to collect data from various databases. In addition, researchers often have to use multiple variant analysis tools to cross-validate their results and increase confidence in their findings. Extracting data from multiple databases and running multiple tools at a time increases the complexity and time required for analysis. Some web-based tools integrate multiple genetic variant databases and provide variant annotations for a few tools. These approaches have limitations in retrieving annotation information and filtering common pathogenic variants. The proposed web-based tool, IPSNP: An Integrated Platform for Predicting Impact of SNPs, is written in Django, a Python-based framework. It uses the RESTful API of MyVariant.info to extract annotation information for variants associated with a given gene, rsID, or HGVS-format variants specified in a VCF file, for 29 tools. The results comprise (1) a CSV file of predictions derived from the consensus decision, (2) a file with annotations for the variants associated with the given gene, (3) a file showing variants commonly declared pathogenic by the selected tools, and (4) a CSV file containing chromosome coordinates based on the GRCh37 and GRCh38 genome assemblies, rsIDs, and proteomic data, so that users can run tools of their choice without manually collecting parameters for each tool. IPSNP is a valuable resource for researchers and clinicians; it can help save time and effort in discovering novel disease-associated variants and developing personalized treatments.
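As a rough illustration of the consensus idea in the abstract above, the following sketch combines per-tool pathogenicity labels by simple majority vote and lists the variants that every selected tool calls pathogenic. The abstract does not say how IPSNP derives its consensus, so the voting rule, the labels, the tool names (`tool_a`, `tool_b`, `tool_c`) and the rsIDs are hypothetical placeholders, not IPSNP's actual implementation.

```python
from collections import Counter

def consensus_call(predictions):
    """Majority label among per-tool predictions (assumed rule, not IPSNP's).

    `predictions` maps a tool name to a label or None (no prediction).
    Returns "no_data" when no tool reported and "uncertain" on a tie.
    """
    votes = Counter(label for label in predictions.values() if label is not None)
    if not votes:
        return "no_data"
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "uncertain"
    return ranked[0][0]

def commonly_pathogenic(variants, selected_tools):
    """IDs of variants that every selected tool labels 'pathogenic'."""
    return [
        variant_id
        for variant_id, preds in variants.items()
        if all(preds.get(tool) == "pathogenic" for tool in selected_tools)
    ]

# Hypothetical input: variant IDs and tool names are placeholders.
variants = {
    "rs0000001": {"tool_a": "pathogenic", "tool_b": "pathogenic", "tool_c": "benign"},
    "rs0000002": {"tool_a": "benign", "tool_b": "benign", "tool_c": None},
    "rs0000003": {"tool_a": "pathogenic", "tool_b": "benign", "tool_c": None},
}

for variant_id, preds in variants.items():
    print(variant_id, consensus_call(preds))
print(commonly_pathogenic(variants, ["tool_a", "tool_b"]))
```

A stricter or weighted rule could be swapped in for `consensus_call` without changing the filtering step.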
Title: An Integrated Framework for Analysis and Prediction of Impact of Single Nucleotide Polymorphism Associated with Human Diseases. Journal: Evolutionary Bioinformatics, volume 20, article 11769343241249916. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11088291/pdf/
Pub Date: 2024-04-26; DOI: 10.1177/11769343241249017
Yihang Zhao, Hong Tang, Jianhua Xu, Feifei Sun, Yuanyuan Zhao, Yang Li
Background: Intestinal metaplasia (IM) of gastric epithelium has traditionally been regarded as an irreversible stage in the Correa cascade. Exploring the potential molecular mechanism of IM is significant for effective gastric cancer prevention.
Methods: The GSE78523 dataset, obtained from the Gene Expression Omnibus (GEO) database, was analyzed using RStudio software to identify the differentially expressed genes (DEGs) between IM tissues and normal gastric epithelial tissues. Subsequently, gene ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, Gene Set Enrichment Analysis (GSEA), and protein-protein interaction (PPI) analysis were used to find potential genes. Additionally, the screened genes were analyzed for clinical, immunological, and genetic correlations using UALCAN (single-gene clinical correlation analysis) and the Tumor–Immune System Interactions Database (TISIDB), and validated through western blot experiments.
Results: Enrichment analysis showed that the lipid metabolic pathway was significantly associated with IM tissues, and the apolipoprotein B (APOB) gene was identified in the subsequent analysis. Experimental results and correlation analysis showed that the expression of APOB was higher in IM tissues than in normal tissues. This elevated expression of APOB was also associated with the expression level of the hepatocyte nuclear factor 4A (HNF4A) gene. HNF4A was also associated with immune cell infiltration in gastric cancer and was linked to the prognosis of gastric cancer patients. Moreover, HNF4A was highly expressed in both IM tissues and gastric cancer cells.
Conclusion: Our findings indicate that HNF4A regulates the microenvironment of lipid metabolism in IM tissues by targeting APOB. Higher expression of HNF4A tends to lead to a worse prognosis in gastric cancer patients, implying that it may serve as a predictive indicator for the progression from IM to gastric cancer.
Title: HNF4A-Bridging the Gap Between Intestinal Metaplasia and Gastric Cancer. Journal: Evolutionary Bioinformatics.
Pub Date: 2024-04-05; DOI: 10.1177/11769343241240558
Ahmed Kabir Refaya, Umashankar Vetrivel, Kannan Palaniyandi
Mycobacterium orygis, a subspecies of the Mycobacterium tuberculosis complex (MTBC), has emerged as a significant concern in the context of One Health, with implications for zoonosis, zooanthroponosis, or both. MTBC strains are characterized by the unique insertion element IS6110, which is widely used as a diagnostic marker. IS6110 transposition drives genetic modifications in MTBC, imparting genome plasticity with profound biological consequences. While IS6110 insertions are customarily found in MTBC genomes, the evolutionary trajectory of strains seems to correlate with the number of IS6110 copies, indicating enhanced adaptability with increasing copy numbers. Here, we present a comprehensive analysis of IS6110 insertions in the M. orygis genome, utilizing ISMapper, and elucidate their genetic consequences in promoting successful host adaptation. Our study encompasses a panel of 67 paired-end read sets, comprising 11 isolates from our laboratory and 56 sequences downloaded from public databases. Among these sequences, 91% carried high-copy and 4.5% low-copy IS6110 insertions, while 4.5% lacked IS6110 insertions. We identified 255 insertion loci, including 141 intragenic and 114 intergenic insertions. Most of these loci were either unique or shared among a limited number of isolates, potentially influencing strain behavior. Furthermore, we conducted gene ontology and pathway analysis, using eggNOG-mapper 5.0, on the protein sequences disrupted by IS6110 insertions, revealing 63 genes involved in diverse Gene Ontology functions and 45 genes participating in various KEGG pathways. Our findings offer novel insights into IS6110 insertions, their preferential insertion regions, and their impact on metabolic processes and pathways, providing valuable knowledge on the genetic changes underpinning IS6110 transposition in M. orygis.
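The intragenic/intergenic split reported above (141 + 114 = 255 loci) can be pictured with a small sketch that checks whether an insertion coordinate falls inside an annotated gene. This is not ISMapper's own logic; the gene names, coordinates, and the assumption of non-overlapping, 1-based inclusive gene intervals are all hypothetical.

```python
from bisect import bisect_right

def classify_insertions(positions, genes):
    """Label each insertion position as intragenic or intergenic.

    `genes` is a list of (start, end, name) tuples with 1-based inclusive
    coordinates; genes are assumed not to overlap (a simplification).
    """
    genes = sorted(genes)
    starts = [gene[0] for gene in genes]
    calls = {}
    for pos in positions:
        # Index of the last gene whose start is <= pos, or -1 if none.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and genes[i][0] <= pos <= genes[i][1]:
            calls[pos] = ("intragenic", genes[i][2])
        else:
            calls[pos] = ("intergenic", None)
    return calls

# Hypothetical annotation and insertion sites, for illustration only.
example_genes = [(100, 200, "geneA"), (300, 400, "geneB")]
example_calls = classify_insertions([50, 150, 250, 400], example_genes)
for position, call in sorted(example_calls.items()):
    print(position, call)

# The counts stated in the abstract add up to the reported total.
assert 141 + 114 == 255
```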
Title: Genomic Characterization of IS6110 Insertions in Mycobacterium orygis. Journal: Evolutionary Bioinformatics.
Pub Date: 2024-03-25; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241239463
Eshan Bundhoo, Anisah W Ghoorah, Yasmina Jaufeerally-Fakim
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), an infectious disease that is a major killer worldwide. Due to selection pressure caused by the use of antibacterial drugs, Mtb is characterised by mutational events that have given rise to multidrug-resistant (MDR) and extensively drug-resistant (XDR) phenotypes. The rate at which mutations occur is an important factor in the study of molecular evolution, and it helps in understanding gene evolution. Within the same species, different protein-coding genes evolve at different rates. To estimate the rates of molecular evolution of protein-coding genes, a commonly used parameter is the ratio dN/dS, where dN is the rate of non-synonymous substitutions and dS is the rate of synonymous substitutions. Here, we estimated the rates of molecular evolution of select biological processes and molecular functions across 264 strains of Mtb. We also investigated the molecular evolutionary rates of core genes of Mtb by computing the dN/dS values, and estimated the pan-genome of the 264 strains. Our results show that the cellular amino acid metabolic process and the kinase activity function evolve at a significantly higher rate, while the carbohydrate metabolic process evolves at a significantly lower rate in M. tuberculosis. These high rates of evolution correlate well with Mtb physiology and pathogenicity. We further propose that the core genome of M. tuberculosis likely experiences varying rates of molecular evolution, which may drive an interplay between the core genome and the accessory genome during M. tuberculosis evolution.
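The dN/dS ratio defined above can be made concrete with a minimal sketch. It assumes the counts of non-synonymous and synonymous differences and sites are already available from a codon-counting step that is not shown, and it applies the Jukes-Cantor correction to each proportion. The abstract does not state which estimator the authors used, so this is one common textbook formulation rather than their method; the function names and example counts are hypothetical.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        return None
    return dn / ds

# Hypothetical counts for one gene compared between two strains.
omega = dn_ds(nonsyn_diffs=3, nonsyn_sites=300, syn_diffs=6, syn_sites=100)
print(omega)
```

By the usual reading, a ratio below 1 suggests purifying selection, a ratio near 1 neutral evolution, and a ratio above 1 positive selection.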
Title: Large-scale Pan Genomic Analysis of Mycobacterium tuberculosis Reveals Key Insights Into Molecular Evolutionary Rate of Specific Processes and Functions. Journal: Evolutionary Bioinformatics, volume 20, article 11769343241239463. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10964447/pdf/
Pantoea sp. strain MHSD4 is a bacterial endophyte isolated from the leaves of the medicinal plant Pellaea calomelanos. Here, we report the draft whole-genome sequence and annotation of strain MHSD4. The draft genome of Pantoea sp. strain MHSD4 is 4 647 677 bp in size, with a G+C content of 54.2% and 41 contigs. The National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline predicted a total of 4395 genes, comprising 4235 protein-coding genes, 87 RNA genes (including 14 non-coding RNAs and 70 tRNAs), and 73 pseudogenes. Pathways for naphthalene and anthracene degradation were identified. Putative genes involved in bioremediation, such as copA, copD, cueO, cueR, glnGm, and trxC, were identified. Putative genes involved in copper homeostasis and tolerance were also identified, which may suggest that Pantoea sp. strain MHSD4 has biotechnological potential for bioremediation of heavy metals.
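For readers unfamiliar with the G+C statistic quoted above, here is a minimal sketch of how it is typically computed from sequence data. The toy sequences are made up and are not from strain MHSD4. The final check assumes that the reported gene total is the sum of protein-coding genes, RNA genes, and pseudogenes, which matches the figures in the abstract.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt

def assembly_gc_content(contigs):
    """G+C percentage over a list of contig sequences taken together."""
    return gc_content("".join(contigs))

# Toy contigs for illustration only.
toy_contigs = ["GGCCAATT", "ggccNNNN"]
print(assembly_gc_content(toy_contigs))

# Gene tally from the abstract: protein-coding + RNA genes + pseudogenes.
assert 4235 + 87 + 73 == 4395
```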
Pub Date: 2024-03-13; DOI: 10.1177/11769343231217908
Dimpho Michelle Morobane, Khuthadzo Tshishonga, Mahloro Hope Serepa-Dlamini
Title: Draft Genome Sequence of Pantoea sp. Strain MHSD4, a Bacterial Endophyte With Bioremediation Potential. Journal: Evolutionary Bioinformatics, volume 20, article 11769343231217908. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10938601/pdf/
Genetic variations in the human genome represent the differences in DNA sequence among individuals. This highlights the important role of whole human genome sequencing, which has become the keystone for precision medicine and disease prediction. Morocco is an important hub for studying the history of human population migration and mixing. This study presents the analysis of 3 Moroccan genomes; the variant analysis revealed 6 379 606 single nucleotide variants (SNVs) and 1 050 577 small InDels. Of the identified SNVs, 219 152 were novel, with 1233 occurring in coding regions, and 5580 non-synonymous single nucleotide variants (nsSNPs) were predicted to affect protein functions. The InDels produced 1055 coding variants and 454 non-3n length variants, and their size ranged from -49 to 49 bp. We further analysed the gene pathways of 8 novel coding variants found in the 3 genomes and revealed 5 genes involved in various diseases and biological pathways. We found that the Moroccan genomes share 92.78% of African ancestry and 92.86% of Non-Finnish European ancestry, according to the gnomAD database. Population structure inference, by admixture analysis and a network-based approach, then revealed that the studied genomes form a mixed population structure, highlighting the increased genetic diversity in Morocco.
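The "non-3n length" InDels mentioned above are those whose length is not a multiple of three and which therefore shift the reading frame when they fall in coding sequence. A minimal sketch of that classification from VCF-style REF and ALT alleles follows. The alleles are invented examples, and the sign convention (negative for deletions, positive for insertions) is an assumption chosen to match the -49 to 49 bp range quoted in the abstract.

```python
def indel_length(ref, alt):
    """Signed InDel length: positive for insertions, negative for deletions."""
    return len(alt) - len(ref)

def is_non_3n(ref, alt):
    """True when the InDel length is not a multiple of 3 (frameshifting in CDS)."""
    length = indel_length(ref, alt)
    if length == 0:
        raise ValueError("REF and ALT have equal length; not an InDel")
    return length % 3 != 0

# Invented REF/ALT pairs for illustration only.
examples = [("A", "ACGT"), ("ACGT", "A"), ("AT", "A"), ("A", "ATG")]
for ref, alt in examples:
    print(ref, alt, indel_length(ref, alt), is_non_3n(ref, alt))
```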
Pub Date: 2024-02-06; DOI: 10.1177/11769343241229278
Nasma Boumajdi, Houda Bendani, Souad Kartti, Tarek Alouane, Lahcen Belyamani, Azeddine Ibrahimi
Title: A Comprehensive Analysis of 3 Moroccan Genomes Revealed Contributions From Both African and European Ancestries. Journal: Evolutionary Bioinformatics, volume 20, article 11769343241229278. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10848790/pdf/
Pub Date: 2024-01-30; eCollection Date: 2024-01-01; DOI: 10.1177/11769343241227331
Yingjie Geng, Yu'e Han, Shujuan Wang, Jia Qi, Xiaoli Bi
Aims: Autophagy plays a significant role in the development of acute myocardial infarction (AMI), and cardiomyocyte autophagy is of major importance in maintaining cardiac function. We aimed to identify key genes associated with autophagy in AMI through bioinformatics analysis and verify them through clinical validation.
Materials and methods: We downloaded the AMI expression profile dataset GSE166780 from the Gene Expression Omnibus (GEO). Autophagy-associated genes potentially differentially expressed in AMI were screened using R software. Then, to identify key autophagy-related genes, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, protein-protein interaction (PPI) analysis, Receiver Operating Characteristic (ROC) curve analysis, and correlation analysis were performed on the differentially expressed autophagy-related genes in AMI. Finally, we used quantitative real-time polymerase chain reaction (qRT-PCR) to verify the RNA expression of the screened key genes.
Results: TSC2, HSPA8, and HIF1A were screened out as key autophagy-related genes. qRT-PCR results showed that the expression levels of HSPA8 and TSC2 in AMI blood samples were lower, and the expression level of HIF1A was higher, than in healthy controls.
Conclusions: TSC2, HSPA8, and HIF1A were identified as key autophagy-related genes in this study. They may influence the development of AMI through autophagy. These findings may help deepen our understanding of AMI and may be useful for the treatment of AMI.
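The ROC curve analysis named in the methods is usually summarised by the area under the curve (AUC). The sketch below computes AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, counting ties as half. The abstract reports no AUC values and does not describe the software used, so the expression values and the function name are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC via pairwise comparison of case and control scores (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up expression values for a gene that is higher in cases.
cases = [7.9, 8.4, 8.1, 9.0]
controls = [7.2, 7.8, 8.0, 7.5]
print(roc_auc(cases, controls))
```

For a gene that is lower in cases, the same function can be applied to negated scores, or the AUC read as one minus the returned value.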
Title: Screening and Validation of Key Genes of Autophagy in Acute Myocardial Infarction Based on Bioinformatics. Journal: Evolutionary Bioinformatics, volume 20, article 11769343241227331. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10832399/pdf/
Pub Date: 2023-12-14; eCollection Date: 2023-01-01; DOI: 10.1177/11769343231216914
Rujun Zhai, Qian Wang
Nociception and pain sensation are important neural processes that help humans avoid injury. Many proteins are involved in nociception and pain sensation in humans; however, the evolution of these proteins in animals is unknown. Here, we chose nociception- and pain-related proteins, including G protein-coupled receptors (GPCRs), ion channels (ICs), and neuropeptides (NPs), which are reportedly associated with nociception and pain in humans, and identified their homologs in various animals by BLAST, phylogenetic analysis, and protein architecture comparison to reveal their evolution from protozoans to humans. We found that the homologs of transient receptor potential channel A1 (TRPA1), TRPM, acid-sensing IC (ASIC), and voltage-dependent calcium channel (VDCC) first appear in Porifera. Substance-P receptor 1 (TACR1) emerged in Coelenterata. Somatostatin receptor type 2 (SSTR2), TRPV1, and voltage-dependent sodium channels (VDSC) appear in Platyhelminthes. Calcitonin gene-related peptide receptor (CGRPR) was first identified in Nematoda. However, opioid receptors (OPRs) and most NPs were found only in vertebrates, from Agnatha to humans. The results demonstrate that homologs of nociception- and pain-related ICs exist from lower to higher animal phyla, and that most of the GPCRs originate sequentially from low to high phyla, whereas OPRs and NPs are newly evolved in vertebrates. These findings provide hints about the evolution of nociception- and pain-related proteins in animals and humans.
{"title":"Phylogenetic Analysis Provides Insight Into the Molecular Evolution of Nociception and Pain-Related Proteins.","authors":"Rujun Zhai, Qian Wang","doi":"10.1177/11769343231216914","DOIUrl":"https://doi.org/10.1177/11769343231216914","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"19","pages":"11769343231216914"},"PeriodicalIF":2.6,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10725132/pdf/","platform":"Semanticscholar","paperid":"138812717","RegionNum":4,"RegionCategory":"生物学","PMCID":"OA","Total":0}
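As a reading aid, the first-appearance pattern reported in the nociception abstract above can be tabulated in a few lines of Python. This is only a sketch. The group order is simply the order in which the abstract lists the groups. The GPCR/IC labels are an assumption based on the standard classification of these proteins. The names `GROUP_ORDER`, `FIRST_APPEARANCE`, and `by_group` are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Order in which the abstract lists the groups (assumed "lower" to "higher").
GROUP_ORDER = ["Porifera", "Coelenterata", "Platyhelminthes", "Nematoda", "Vertebrata"]

# protein -> (class, earliest group with a reported homolog), as stated in the abstract.
# The class labels (IC = ion channel, GPCR) are an assumption, not quoted from the paper.
FIRST_APPEARANCE = {
    "TRPA1": ("IC", "Porifera"),
    "TRPM": ("IC", "Porifera"),
    "ASIC": ("IC", "Porifera"),
    "VDCC": ("IC", "Porifera"),
    "TACR1": ("GPCR", "Coelenterata"),
    "SSTR2": ("GPCR", "Platyhelminthes"),
    "TRPV1": ("IC", "Platyhelminthes"),
    "VDSC": ("IC", "Platyhelminthes"),
    "CGRPR": ("GPCR", "Nematoda"),
    "OPRs": ("GPCR", "Vertebrata"),
}


def by_group(table):
    """Return {group: [labelled proteins]} with groups in GROUP_ORDER."""
    grouped = defaultdict(list)
    for protein, (protein_class, group) in table.items():
        grouped[group].append(f"{protein} ({protein_class})")
    return {g: grouped[g] for g in GROUP_ORDER if g in grouped}


if __name__ == "__main__":
    for group, proteins in by_group(FIRST_APPEARANCE).items():
        print(f"{group}: {', '.join(proteins)}")
```

Read this way, the ion channels cluster in the earliest groups while the GPCRs are spread across successive groups, which matches the pattern the abstract describes.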
Pub Date: 2023-12-01; eCollection Date: 2023-01-01; DOI: 10.1177/11769343231217916
Ibrahim H Eissa, Reda G Yousef, Eslam B Elkaeed, Aisha A Alsfouk, Dalal Z Husein, Ibrahim M Ibrahim, Hesham A El-Mahdy, Hazem Elkady, Ahmed M Metwaly
The overexpression of the Epidermal Growth Factor Receptor (EGFR) marks it as a pivotal target in cancer treatment, with the aim of reducing cancer cell proliferation and inducing apoptosis. This study aimed at the computer-assisted drug discovery (CADD) of a new apoptotic EGFR inhibitor. The natural alkaloid theobromine was used as a starting point to obtain a new semisynthetic (di-ortho-chloro acetamide) derivative (T-1-DOCA). First, T-1-DOCA's total electron density, energy gap, reactivity indices, and electrostatic surface potential were determined by DFT calculations. Then, molecular docking studies were carried out to predict the potential of T-1-DOCA against wild-type and mutant EGFR proteins. T-1-DOCA's correct binding was further confirmed by molecular dynamics (MD) simulations over 100 ns, MM-GBSA, and PLIP analyses. In vitro, T-1-DOCA showed noticeable efficacy compared to erlotinib, suppressing wild-type EGFR (EGFR-WT) and the T790M mutant (EGFR-T790M) with IC50 values of 56.94 and 269.01 nM, respectively. T-1-DOCA also inhibited the proliferation of the H1975 and HCT-116 malignant cell lines, exhibiting IC50 values of 14.12 and 23.39 µM, with selectivity indices of 6.8 and 4.1, respectively, indicating its anticancer potential and general safety. The apoptotic effects of T-1-DOCA were indicated by flow cytometric analysis and were further confirmed through its potential to increase the levels of BAX, Casp3, and Casp9 and to decrease Bcl-2 levels. In conclusion, T-1-DOCA, a new apoptotic EGFR inhibitor, was designed and evaluated both computationally and experimentally. The results suggest that T-1-DOCA is a promising candidate for further development as an anti-cancer drug.
{"title":"Computer-Assisted Drug Discovery of a Novel Theobromine Derivative as an EGFR Protein-Targeted Apoptosis Inducer.","authors":"Ibrahim H Eissa, Reda G Yousef, Eslam B Elkaeed, Aisha A Alsfouk, Dalal Z Husein, Ibrahim M Ibrahim, Hesham A El-Mahdy, Hazem Elkady, Ahmed M Metwaly","doi":"10.1177/11769343231217916","DOIUrl":"10.1177/11769343231217916","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"19","pages":"11769343231217916"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10693208/pdf/","platform":"Semanticscholar","paperid":"138479082","RegionNum":4,"RegionCategory":"生物学","PMCID":"OA","Total":0}
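The selectivity indices quoted in the T-1-DOCA abstract above can be related to the cell-line IC50 values with simple arithmetic. The sketch below assumes the common definition SI = IC50 (normal cells) / IC50 (cancer cells). The abstract does not state which definition or which normal cell line was used, so the implied normal-cell IC50 values are a back-of-the-envelope consistency check, not reported data. The function name `implied_normal_ic50` is hypothetical.

```python
def implied_normal_ic50(cancer_ic50_um: float, selectivity_index: float) -> float:
    """Normal-cell IC50 (in uM) implied by SI = IC50_normal / IC50_cancer.

    The SI definition is an assumption; the abstract does not spell it out.
    """
    return cancer_ic50_um * selectivity_index


# Values reported in the abstract: (cell line, IC50 in uM, selectivity index).
REPORTED = [("H1975", 14.12, 6.8), ("HCT-116", 23.39, 4.1)]

if __name__ == "__main__":
    for cell_line, ic50_um, si in REPORTED:
        implied = implied_normal_ic50(ic50_um, si)
        print(f"{cell_line}: implied normal-cell IC50 ~ {implied:.1f} uM")
```

Under that assumption, both cell lines imply a normal-cell IC50 of roughly 96 µM, which would be consistent with a single normal cell line serving as the reference for both indices.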
Background: G-quadruplexes (G4s) are secondary structures in DNA and RNA that impact various cellular processes, such as transcription, splicing, and translation. Due to their numerous functions, G4s are involved in many diseases, making their study important. Yet G4 evolution remains largely unknown, due to the low sequence similarity of G4s and the poor quality of their sequence alignments across species. To address this, we designed a strategy that avoids direct G4 alignment to study G4 evolution in the 3 living kingdoms. We also explored the coevolution between RNA-binding proteins (RBPs) and G4s. Methods: We retrieved one-to-one orthologous genes from the Ensembl Compara database and computed groups of one-to-one orthologous genes. For each group, we aligned gene sequences and identified G4 families as groups of overlapping G4s in the alignment. We analyzed these G4 families using Count, a tool to infer feature evolution along a gene or species tree. Additionally, we utilized these G4 families to predict G4s by homology. To establish a control dataset, we performed mono-, di-, and tri-nucleotide shuffling. Results: Only a few conserved G4s occur among all living kingdoms. In eukaryotes, G4s exhibit slight conservation among vertebrates, and few are conserved among plants. In archaea and bacteria, at most 2 G4s are common. The G4 homology-based prediction increases the number of conserved G4s in common ancestors. The coevolution between RBPs and G4s was investigated and revealed a modest impact of RBP evolution on G4 evolution. However, the details of this relationship remain unclear. Conclusion: Even if G4 evolution still eludes us, the present study provides key information to compute groups of homologous G4s and to reveal the evolutionary history of G4 families.
{"title":"Toward a Better Understanding of G4 Evolution in the 3 Living Kingdoms.","authors":"Anaïs Vannutelli, Aïda Ouangraoua, Jean-Pierre Perreault","doi":"10.1177/11769343231212075","DOIUrl":"10.1177/11769343231212075","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"19","pages":"11769343231212075"},"PeriodicalIF":2.6,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10693206/pdf/","platform":"Semanticscholar","paperid":"138479083","RegionNum":4,"RegionCategory":"生物学","PMCID":"OA","Total":0}
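The Methods of the G4 abstract above define a G4 family as a group of overlapping G4s in the alignment of orthologous genes. A minimal sketch of that grouping step is given below, assuming each predicted G4 has already been mapped to a half-open interval of alignment columns and that overlap is applied transitively. The abstract does not state the exact overlap criterion, so this is one plausible reading and not the authors' implementation. The `G4` record, the function name, the species labels, and the coordinates are all made up for illustration.

```python
from typing import List, NamedTuple


class G4(NamedTuple):
    species: str
    start: int  # first alignment column covered (inclusive)
    end: int    # alignment column just past the G4 (exclusive)


def group_into_families(g4s: List[G4]) -> List[List[G4]]:
    """Group G4s whose alignment-column intervals overlap, transitively."""
    families: List[List[G4]] = []
    family_end = -1  # right edge of the family currently being extended
    for g4 in sorted(g4s, key=lambda g: (g.start, g.end)):
        if families and g4.start < family_end:
            # Overlaps the current family: join it and extend its right edge.
            families[-1].append(g4)
            family_end = max(family_end, g4.end)
        else:
            # No overlap: start a new family.
            families.append([g4])
            family_end = g4.end
    return families


if __name__ == "__main__":
    predicted = [
        G4("human", 10, 30), G4("mouse", 25, 40),
        G4("zebrafish", 38, 50), G4("human", 100, 120),
    ]
    for i, family in enumerate(group_into_families(predicted), start=1):
        print(i, [f"{g.species}:{g.start}-{g.end}" for g in family])
```

Sorting by start column lets a single pass build the families. The presence or absence of each family per species is then the kind of feature table that a tree-based tool such as Count could take as input.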