Evolutionary Bioinformatics: Latest Articles

Does Ulcerative Colitis Influence the Inter-individual Heterogeneity of the Human Intestinal Mucosal Microbiome?
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-10-10 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320948848
Yang Sun, Lianwei Li, Aiyun Lai, Wanmeng Xiao, Kunhua Wang, Lan Wang, Junkun Niu, Juan Luo, Hongju Chen, Lin Dai, Yinglei Miao

The dysbiosis of the gut microbiome associated with ulcerative colitis (UC) has been extensively studied in recent years. However, the question of whether UC influences the spatial heterogeneity of the human gut mucosal microbiome has not been addressed. Spatial heterogeneity (specifically, the inter-individual heterogeneity in microbial species abundances) is one of the most important characteristics at both population and community scales, and can be assessed and interpreted by Taylor's power law (TPL) and its community-scale extensions (TPLEs). Due to the high mobility of microbes, it is difficult to investigate their spatial heterogeneity explicitly; however, TPLE offers an effective approach to analyzing microbial communities implicitly. Here, we investigated the influence of UC on the spatial heterogeneity of the gut microbiome with intestinal mucosal microbiome samples collected from 28 UC patients and healthy controls. Specifically, we applied Type-I TPLE for measuring community spatial heterogeneity and Type-III TPLE for measuring mixed-species population heterogeneity to evaluate the heterogeneity changes of the mucosal microbiome induced by UC at both the community and species scales. We further used a permutation test to determine the possible differences between UC patients and healthy controls in heterogeneity scaling parameters. Results showed that UC did not significantly influence gut mucosal microbiome heterogeneity at either the community or mixed-species levels. These findings demonstrated significant resilience of the human gut microbiome and confirmed a prediction of TPLE: that the inter-subject heterogeneity scaling parameter of the gut microbiome is an intrinsic property of humans, invariant with UC.
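
The power-law fit and permutation test mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard form of Taylor's power law, V = a·M^b, fitted by least squares on the log-log scale, and assumes each data point is a (mean, variance) pair for one sample. The function names `fit_tpl` and `permutation_p_value` are hypothetical and are not taken from the paper, whose exact procedure may differ.

```python
import math
import random


def fit_tpl(means, variances):
    """Fit ln(V) = ln(a) + b * ln(M) by ordinary least squares; return (a, b)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


def permutation_p_value(group1, group2, n_perm=1000, seed=0):
    """Two-sided permutation test on the difference in the scaling parameter b.

    Each group is a list of (mean, variance) pairs. Group labels are
    shuffled n_perm times and the slope difference is recomputed each time.
    """
    rng = random.Random(seed)

    def slope(pairs):
        return fit_tpl([m for m, _ in pairs], [v for _, v in pairs])[1]

    observed = abs(slope(group1) - slope(group2))
    pooled = group1 + group2  # new list, so the inputs are not modified
    k = len(group1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(slope(pooled[:k]) - slope(pooled[k:])) >= observed:
            hits += 1
    return hits / n_perm
```

A large p-value from such a test would be consistent with the abstract's conclusion that the scaling parameter does not differ between groups, although the toy procedure here is only an approximation of that kind of analysis.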

Citations: 2
Analysis of Continuous Mutation and Evolution on Circulating SARS-CoV-2.
IF 1.7 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-10-01 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320954870
Jie-Mei Yu, Li-Shu Zhang, Yuan-Hui Fu, Feng-Min Ji, Han-Li Xu, Jia-Qiang Huang, Xiang-Lei Peng, Yan-Peng Zheng, Ying Zhang, Jin-Sheng He

Monitoring the mutation and evolution of the virus is important for tracing its ongoing transmission and facilitating effective vaccine development. A total of 342 complete genomic sequences of SARS-CoV-2 were analyzed in this study. Compared to the reference genome reported in December 2019, 465 mutations were found, of which 347 occurred in only 1 sequence, while 26 occurred in more than 5 sequences. Of these 26, which were further identified as SNPs, 14 were closely linked and were grouped into 5 profiles. Phylogenetic analysis revealed that the sequences formed 2 major groups. Most of the sequences from the late period (March and April) constituted Cluster II, while the sequences before March in this study and the reported S/L and A/B/C types in previous studies were all in Cluster I. The distributions of some mutations were geographically or temporally specific, and their potential effect on the transmission and pathogenicity of SARS-CoV-2 deserves further evaluation and monitoring. Two mutations were found in the receptor-binding domain (RBD) but outside the receptor-binding motif (RBM), indicating that these mutations may have only marginal biological effects but merit further attention. The observed novel sequence divergence is of great significance to the study of the transmission and pathogenicity of SARS-CoV-2 and to the development of an effective vaccine.
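
The tallying described above (how many sequences carry each mutation relative to a reference) can be sketched roughly as follows. This is an illustration only: it assumes the sequences are already aligned to the reference and equal in length, it counts substitutions only, and the helper names `call_substitutions` and `mutation_frequencies` are hypothetical rather than taken from the study.

```python
from collections import Counter


def call_substitutions(reference, sequence):
    """Return the set of (position, ref_base, alt_base) substitutions.

    Positions are 1-based. Assumes both sequences are pre-aligned and of
    equal length; gaps and ambiguous bases are ignored.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return {
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sequence))
        if r != s and r in "ACGT" and s in "ACGT"
    }


def mutation_frequencies(reference, sequences):
    """Count, for each substitution, how many sequences carry it."""
    counts = Counter()
    for seq in sequences:
        counts.update(call_substitutions(reference, seq))
    return counts
```

Filtering the resulting counter by a threshold (for example, keeping entries seen in more than 5 sequences) would separate recurrent mutations from singletons in the way the abstract reports.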

Citations: 0
inCNV: An Integrated Analysis Tool for Copy Number Variation on Whole Exome Sequencing.
IF 1.7 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-09-24 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320956577
Saowwapark Chanwigoon, Sakkayaphab Piwluang, Duangdao Wichadakul

The detection of copy number variations (CNVs) on whole-exome sequencing (WES) represents a cost-effective technique for the study of genetic variants. This approach, however, suffers from high false-positive rates due to biases from exome sequencing capture kits and GC content. Although plenty of CNV detection tools have been developed, they do not perform well with all types of CNVs. In addition, most tools lack features for genetic annotation, CNV visualization, and flexible installation, requiring users to put much effort into CNV interpretation. Here, we present "inCNV," a web-based application that can accept multiple CNV-tool results, then integrate and prioritize them with user-friendly interfaces. This application helps users analyze the importance of called CNVs by generating CNV annotations from Ensembl, Database of Genomic Variants (DGV), ClinVar, and Online Mendelian Inheritance in Man (OMIM). Moreover, users can select and export CNVs of interest, including their flanking sequences, for primer design and experimental verification. We demonstrated how inCNV could help users filter and narrow down the called CNVs to a potentially novel CNV, a common CNV within a group of samples of the same disease, or a de novo CNV of a sample within the same family. In addition, we provide inCNV as a Docker image for ease of installation (https://github.com/saowwapark/inCNV).
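
To make the idea of integrating results from several CNV callers concrete, here is a small sketch that merges overlapping calls of the same type and records which tools support each merged region. This is not inCNV's actual algorithm, which the abstract does not describe; the `CnvCall` type, the `merge_calls` function, and the overlap rule are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class CnvCall(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "DUP" or "DEL"
    tool: str  # name of the caller that produced this call


def merge_calls(calls):
    """Merge overlapping calls with the same chromosome and CNV type.

    Returns a list of (chrom, start, end, kind, supporting_tools) tuples,
    where supporting_tools is a sorted list of tool names.
    """
    merged = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.kind, c.start)):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last[0] == call.chrom
            and last[3] == call.kind
            and call.start <= last[2]
        ):
            # Overlaps the previous region: extend it and record the tool.
            last[2] = max(last[2], call.end)
            last[4].add(call.tool)
        else:
            merged.append([call.chrom, call.start, call.end, call.kind, {call.tool}])
    return [(c, s, e, k, sorted(t)) for c, s, e, k, t in merged]
```

Regions supported by more tools could then be ranked higher, which is one plausible way to prioritize calls when individual callers have high false-positive rates.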

Citations: 0
Prognostic Score-based Clinical Factors and Metabolism-related Biomarkers for Predicting the Progression of Hepatocellular Carcinoma.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-09-22 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320951571
Jia Yan, Ming Shu, Xiang Li, Hua Yu, Shuhuai Chen, Shujie Xie

Hepatocellular carcinoma (HCC) is a common malignant tumor representing more than 90% of primary liver cancer. This study aimed to identify metabolism-related biomarkers with prognostic value by developing a novel prognostic score (PS) model. Transcriptomic profiles derived from TCGA and EBIArray databases were analyzed to identify differentially expressed genes (DEGs) in HCC tumor samples compared with normal samples. The overlapped genes between DEGs and metabolism-related genes (crucial genes) were screened and functionally analyzed. A novel PS model was constructed to identify optimal signature genes. Cox regression analysis was performed to identify independent clinical factors related to prognosis. A nomogram model was constructed to estimate the predictability of clinical factors. Finally, protein expression of crucial genes was explored in different cancer tissues and cell types from the Human Protein Atlas (HPA). We screened a total of 305 overlapped genes (differentially expressed metabolism-related genes). These genes were mainly involved in "oxidation reduction," "steroid hormone biosynthesis," "fatty acid metabolic process," and "linoleic acid metabolism." Furthermore, we screened ten optimal DEGs (CYP2C9, CYP3A4, and TKT, among others) by using the PS model. Two clinical factors, pathologic stage (P < .001, HR: 1.512 [1.219-1.875]) and PS status (P < .001, HR: 2.259 [1.522-3.354]), were independent prognostic predictors in Cox regression analysis. The nomogram model showed a high predicted probability of overall survival time, and the AUC value was 0.837. The expression status of 7 proteins was frequently altered in normal or different tumor tissues, such as liver cancer and stomach cancer samples. We have identified several metabolism-related biomarkers for prognosis prediction of HCC based on the PS model. Two clinical factors, pathologic stage and PS status (high/low risk), were independent prognostic predictors. The prognosis prediction model described in this study is a useful and stable method for novel biomarker identification.
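
The abstract does not state how the prognostic score is computed. One common construction, shown here purely as an assumption, is a weighted sum of signature-gene expression values with regression coefficients as weights, followed by a median split into high- and low-risk groups. The function names and the coefficient values used with them are hypothetical toy choices, not results from the study.

```python
from statistics import median


def prognostic_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes.

    `coefficients` maps gene name -> weight (assumed to come from a fitted
    regression model); `expression` maps gene name -> expression level.
    """
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def risk_groups(scores):
    """Label each patient 'high' if the score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
```

Under this construction, the resulting high/low label is what would enter a Cox model alongside clinical covariates such as pathologic stage.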

Citations: 0
Genome-Wide Identification and Characterization of the SHI-Related Sequence Gene Family in Rice.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-09-11 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320941495
Jun Yang, Peng Xu, Diqiu Yu

Rice (Oryza sativa) yield is correlated to various factors. Transcription regulators are important factors, such as the typical SHORT INTERNODES-related sequences (SRSs), which encode proteins with single zinc finger motifs. Nevertheless, knowledge regarding the evolutionary and functional characteristics of the SRS gene family members in rice is insufficient. Therefore, we performed a genome-wide screening and characterization of the OsSRS gene family in Oryza sativa japonica rice. We also examined the SRS proteins from 11 rice sub-species, consisting of 3 cultivars, 6 wild varieties, and 2 other genome types. SRS members from maize, sorghum, Brachypodium distachyon, and Arabidopsis were also investigated. All these SRS proteins exhibited species-specific characteristics, as well as monocot- and dicot-specific characteristics, as assessed by phylogenetic analysis, which was further validated by gene structure and motif analyses. Genome comparisons revealed that segmental duplications may have played significant roles in the recombination of the OsSRS gene family and their expression levels. The family was mainly subjected to purifying selective pressure. In addition, the expression data demonstrated the distinct responses of OsSRS genes to various abiotic stresses and hormonal treatments, indicating their functional divergence. Our study provides a good reference for elucidating the functions of SRS genes in rice.

Citations: 14
ZMAT2 in Humans and Other Primates: A Highly Conserved and Understudied Gene.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-09-02 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320941500
Kabita Baral, Peter Rotwein

Recent advances in genetics present unique opportunities for enhancing our understanding of human physiology and disease predisposition through detailed analysis of gene structure, expression, and population variation via examination of data in publicly accessible genome and gene expression repositories. Yet, the vast majority of human genes remain understudied. Here, we show the scope of these genomic and genetic resources by evaluating ZMAT2, a member of a 5-gene family that through May 2020 had been the focus of only 4 peer-reviewed scientific publications. Using analysis of information extracted from public databases, we show that human ZMAT2 is a 6-exon gene and find that it exhibits minimal genetic variation in human populations and in disease states, including cancer. We further demonstrate that the gene and its encoded protein are highly conserved among nonhuman primates and define a cohort of ZMAT2 pseudogenes in the marmoset genome. Collectively, our investigations illustrate how complementary use of genomic, gene expression, and population genetic resources can lead to new insights about human and mammalian biology and evolution, and when coupled with data supporting key roles for ZMAT2 in keratinocyte differentiation and pre-RNA splicing argue that this gene is worthy of further study.

Citations: 3
Identification of Metastasis-Associated Genes in Triple-Negative Breast Cancer Using Weighted Gene Co-expression Network Analysis.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-09-01 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320954868
Wenting Xie, Zhongshi Du, Yijie Chen, Naxiang Liu, Zhaoming Zhong, Youhong Shen, Lina Tang

Triple-negative breast cancer (TNBC) is the most aggressive and fatal subtype of breast cancer. This study aimed to identify metastasis-associated genes that could serve as biomarkers for TNBC diagnosis and prognosis. RNA-seq data and clinical information on TNBC from the Cancer Genome Atlas were used to conduct analyses. Expression data were used to establish co-expression modules using average linkage hierarchical clustering. We used weighted gene co-expression network analysis to explore the associations between gene sets and clinical features and to identify metastasis-associated candidate biomarkers. The K-M plotter website was used to explore the association between the expression of candidate biomarkers and patient survival. In addition, receiver operating characteristic curve analysis was used to illustrate the diagnostic performance of candidate genes. The pale turquoise module was significantly associated with the occurrence of metastasis. In this module, 64 genes were identified, and functional enrichment analysis revealed that they were mainly associated with transcriptional misregulation in cancer, microRNAs in cancer, and negative regulation of angiogenesis. Further, 4 genes, IGSF10, RUNX1T1, XIST, and TSHZ2, which were negatively associated with relapse-free survival and have seldom been reported before in TNBC, were selected. In addition, the mRNA expression levels of the 4 candidate genes were significantly lower in TNBC tumor tissues compared with healthy tissues. Based on the K-M plotter, these 4 genes were correlated with poor prognosis of TNBC. The areas under the curve of IGSF10, RUNX1T1, TSHZ2, and XIST were 0.918, 0.957, 0.977, and 0.749, respectively. These findings provide new insight into TNBC metastasis. IGSF10, RUNX1T1, TSHZ2, and XIST could be used as candidate biomarkers for the diagnosis and prognosis of TNBC metastasis.
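
The area under the ROC curve reported for each gene can be understood as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below computes it directly from that definition, with ties counted as one half. The function name `roc_auc` and its inputs are hypothetical, and the study presumably used a standard statistics package rather than code like this.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. If lower expression indicates the positive class,
    negate the scores before calling this function.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading the per-gene values quoted in the abstract.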

Citations: 6
Complete Genome and Comparative Genome Analysis of Lactobacillus reuteri YSJL-12, a Potential Probiotics Strain Isolated From Healthy Sow Fresh Feces.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-07-27 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320942192
Su Xu, Jianjun Cheng, Xiangchen Meng, Yan Xu, Ying Mu

Lactobacillus reuteri YSJL-12 was isolated from healthy sow fresh feces and has previously been used as a probiotic additive. To investigate the genetic basis of its probiotic potential and identify the genes in the strain, the complete genome of YSJL-12 was sequenced, and a comparative genome analysis of 9 strains of Lactobacillus reuteri was then performed. The genome of YSJL-12 consisted of a circular 2,084,748 bp chromosome and 2 circular plasmids (51,906 and 15,134 bp). Among the 2065 protein-coding sequences (CDSs), genes conferring resistance to environmental stress were identified. The functions of COG (Clusters of Orthologous Groups) protein genes were predicted, and the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were analyzed. The comparative genome analysis indicated that the pan-genome contained a core genome of 1257 orthologous gene clusters, an accessory genome of 1064 orthologous gene clusters, and 1148 strain-specific genes, and that the antibacterial mechanisms of Lactobacillus reuteri strains might differ. The phylogenetic analysis and genomic collinearity revealed that the phylogenetic relationships among the 9 strains of Lactobacillus reuteri were associated with host species and showed host specificity. This research could help to better predict gene function and understand the genetic basis of adaptation to the host gut in Lactobacillus reuteri YSJL-12.
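
The split into core, accessory, and strain-specific genes can be pictured as a simple classification over a presence/absence table. The sketch below assumes the orthologous clusters have already been computed by some upstream tool (the abstract does not name one); `partition_pan_genome`, the cluster IDs, and the strain labels are all hypothetical.

```python
def partition_pan_genome(presence):
    """Classify clusters by how many strains carry them.

    presence: dict mapping cluster ID -> set of strain names carrying it.
    Returns (core, accessory, strain_specific) as sorted lists of cluster IDs.
    """
    strains = set().union(*presence.values())
    core, accessory, specific = [], [], []
    for cluster, carriers in sorted(presence.items()):
        if carriers == strains:
            core.append(cluster)        # present in every strain
        elif len(carriers) == 1:
            specific.append(cluster)    # present in exactly one strain
        else:
            accessory.append(cluster)   # present in some, but not all
    return core, accessory, specific


# Hypothetical toy input with three strains, not data from the study.
toy = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"C"},
    "c4": {"A", "B", "C"},
}
core, accessory, specific = partition_pan_genome(toy)
```

With 9 strains, as in the study, a cluster would count as core only if all 9 carry it, so the core size is sensitive to annotation gaps in any single genome.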

Citations: 4
Generation of Cry11 Variants of Bacillus thuringiensis by Heuristic Computational Modeling.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-07-27 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320924681
Efraín Hernando Pinzón-Reyes, Daniel Alfonso Sierra-Bueno, Miguel Orlando Suarez-Barrera, Nohora Juliana Rueda-Forero, Sebastián Abaunza-Villamizar, Paola Rondón-Villareal

Directed evolution methods mimic Darwinian evolution in vitro, inducing random mutations and selective pressure in genes to obtain proteins with enhanced characteristics. These techniques are developed using trial-and-error testing at an experimental level with a high degree of uncertainty. Therefore, in silico modeling of directed evolution is required to support experimental assays. Several in silico approaches have reproduced directed evolution, using statistical, thermodynamic, and kinetic models in an attempt to recreate experimental conditions. Likewise, optimization techniques using heuristic models have been used to understand and find the best scenarios of directed evolution. Our study uses an in silico model named HeurIstics DirecteD EvolutioN, which is based on a genetic algorithm designed to generate chimeric libraries from 2 parental genes, cry11Aa and cry11Ba, of Bacillus thuringiensis. These genes encode crystal-shaped δ-endotoxins with 3 conserved domains. Cry11 toxins are of biotechnological interest because they have been shown to be effective as biopesticides against disease-spreading vectors. With our heuristic model, we considered experimental parameters such as DNA fragmentation length, number of generations or simulation cycles, and mutation rate, to obtain characteristics of Cry11 chimeric libraries such as percentage of population identity, truncation of variants caused by the presence of internal stop codons, percentage of thermodynamic diversity, and stability of variants. Our study allowed us to focus on experimental conditions that may be useful for the design of in vitro and in silico experiments of directed evolution with Cry toxins of 3 conserved domains. Furthermore, we obtained in silico libraries of Cry11 variants, in which structural characteristics of wild Cry families were observed in a review of a sample of in silico sequences. We consider that future studies could use our in silico libraries and heuristic computational models, such as the one suggested here, to support in vitro experiments of directed evolution.
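
To make the parameters named in this abstract concrete (fragment length, mutation rate, truncation by internal stop codons), here is a minimal sketch of one recombination-plus-mutation step. It is not the authors' model: `make_chimera`, `has_internal_stop`, and the two short parent strings are hypothetical, and the sketch assumes pre-aligned parents of equal length, which the real cry11Aa and cry11Ba genes need not satisfy.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_internal_stop(seq):
    """True if an in-frame stop codon occurs before the final codon."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return any(c in STOP_CODONS for c in codons[:-1])


def make_chimera(parent_a, parent_b, fragment_len, mutation_rate, rng):
    """Pick each fragment from one parent at random, then mutate single bases."""
    if len(parent_a) != len(parent_b):
        raise ValueError("this sketch assumes pre-aligned parents of equal length")
    pieces = []
    for start in range(0, len(parent_a), fragment_len):
        source = rng.choice((parent_a, parent_b))
        pieces.append(source[start:start + fragment_len])
    bases = list("".join(pieces))
    for i in range(len(bases)):
        if rng.random() < mutation_rate:
            # Substitute with one of the three other bases.
            bases[i] = rng.choice([b for b in "ACGT" if b != bases[i]])
    return "".join(bases)


# Hypothetical toy parents (15 nt each); real genes are far longer.
PARENT_A = "ATGGCTAAAGGTTAA"
PARENT_B = "ATGGCAAAGGGCTAA"
rng = random.Random(0)
library = [make_chimera(PARENT_A, PARENT_B, 3, 0.05, rng) for _ in range(20)]
truncated_fraction = sum(has_internal_stop(v) for v in library) / len(library)
```

Repeating this step over several generations, with a selection rule between them, would give the general shape of a genetic algorithm; the selection criteria used in the study are not described in the abstract.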

Citations: 3
Exploiting Homoplasy in Genome-Wide Association Studies to Enhance Identification of Antibiotic-Resistance Mutations in Bacterial Genomes.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2020-07-27 eCollection Date: 2020-01-01 DOI: 10.1177/1176934320944932
Yi-Pin Lai, Thomas R Ioerger

Many antibacterial drugs have multiple mechanisms of resistance, which are often represented simultaneously by a mixture of resistance mutations (some more frequent than others) in a clinical population. This presents a challenge for Genome-Wide Association Studies (GWAS) methods, making it difficult to detect less prevalent resistance mechanisms purely through (weak) statistical associations. Homoplasy, or the occurrence of multiple independent mutations at the same site, is often observed with drug resistance mutations and can be a strong indicator of positive selection. However, traditional GWAS methods, such as those based on allele counting or linear regression, are not designed to take homoplasy into account. In this article, we present a new method, called ECAT (for Evolutionary Cluster-based Association Test), that extends traditional regression-based GWAS methods with the ability to take advantage of homoplasy. This is achieved through a preprocessing step which identifies hypervariable regions in the genome exhibiting statistically significant clusters of distinct evolutionary changes, to which association testing by a linear mixed model (LMM) is applied using GEMMA (a well-established LMM-based GWAS tool). Thus, the approach can be viewed as extending GEMMA from the usual site- or gene-level analysis to focusing on clustered regions of mutations. This approach was evaluated on a large collection of more than 600 clinical isolates of multidrug-resistant (MDR) Mycobacterium tuberculosis from Lima, Peru. We show that ECAT does a better job of detecting known resistance mutations for several antitubercular drugs (including less prevalent mutations with weaker associations), compared with (site- or gene-based) GEMMA, as representative of existing GWAS methods. The power of the multiphase approach in ECAT comes from focusing association testing on the hypervariable regions of the genome, which reduces complexity in the model and increases statistical power.
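
The preprocessing idea described here, flagging regions where independent mutation events pile up more than chance would predict, can be illustrated with a deliberately simple stand-in. The sketch below is not ECAT: it assumes events are spread uniformly under the null, uses fixed non-overlapping windows and a Poisson tail test, and omits multiple-testing correction. `hypervariable_windows`, `poisson_sf`, and the toy positions are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


def hypervariable_windows(event_positions, genome_len, window, alpha):
    """Return (start, end, count) for windows with an excess of events.

    event_positions: genome coordinates of independent mutation events,
    e.g. as inferred from a phylogeny (that inference is not shown here).
    """
    # Expected events per window if events were spread uniformly.
    lam = len(event_positions) * window / genome_len
    hits = []
    for start in range(0, genome_len - window + 1, window):
        n = sum(start <= p < start + window for p in event_positions)
        if n > 0 and poisson_sf(n, lam) < alpha:
            hits.append((start, start + window, n))
    return hits


# Hypothetical toy input: five events clustered near position 100, one isolated.
events = [105, 110, 120, 130, 140, 5000]
hits = hypervariable_windows(events, genome_len=10000, window=100, alpha=0.01)
```

Only the flagged windows would then be passed to association testing, which is the sense in which restricting the test set can raise statistical power.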

Citations: 7