
Evolutionary Bioinformatics: Latest Articles

Utilizing In Silico Approaches to Investigate the Signaling Pathway’s Crucial Function in Pennisetum glaucum Under Thermal Stress
Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2023-01-01 DOI: 10.1177/11769343231211072
Faten Dhawi
Pearl millet (Pennisetum glaucum (L.)) is a remarkable cereal crop known for its ability to thrive in challenging environmental conditions. Despite its resilience, the intricate molecular mechanisms behind its toughness remain a mystery. To address this knowledge gap, we conducted advanced next-generation RNA sequencing. This approach allowed us to compare the gene expression profiles of pearl millet seedlings exposed to heat stress with those grown under standard conditions. Our main focus was on the shoots of 13-day-old pearl millet plants, which we subjected to a brief heat stress episode at 50°C for 60 seconds. Within the vast genomic landscape comprising 36 041 genes, we successfully identified a set of 10 genes that exhibited significant fold changes, ranging from 11- to 14-fold compared to the control conditions. Expression changes of this magnitude had not previously been reported for these 10 genes. To uncover the functional significance hidden within these transcriptomic findings, we utilized computational tools such as MEME, STRING, and phylogenetic tree analysis. These efforts collectively revealed conserved domains within the transcriptomic landscape, hinting at potential functions associated with these genetic sequences. Of particular note, the distinct transcriptomic patterns specific to pearl millet leaves under thermal stress shed light on intricate connections to fundamental biological processes. These processes included the Ethylene-activated signaling pathway, Regulation of intracellular signal transduction, Negative regulation of signal transduction, Protein autophosphorylation, and Intracellular signal transduction. Together, these processes provide insight into the molecular strategies employed by pearl millet to overcome thermal stress challenges.
By integrating cutting-edge RNA sequencing techniques and computational analyses, we have embarked on unraveling the genetic components and pathways that empower pearl millet’s resilience in the face of adversity. This newfound understanding has the potential to not only advance our knowledge of plant stress responses but also contribute to enhancing crop resilience in challenging environmental conditions.
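The fold changes quoted in the abstract can be illustrated with a minimal sketch. The abstract does not say how expression was normalized or whether the 11- to 14-fold figures are linear or log-scaled, so the pseudocount, the function names, and the expression values below are all assumptions made for illustration only.

```python
import math


def fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """Linear fold change of treated over control.

    The pseudocount (an assumption, not taken from the paper) avoids
    division by zero for genes with no reads in the control sample.
    """
    return (treated + pseudocount) / (control + pseudocount)


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """Log2 of the linear fold change; positive values mean upregulation."""
    return math.log2(fold_change(treated, control, pseudocount))


# Hypothetical normalized expression values (heat-stressed, control).
expression = {"gene_A": (1299.0, 99.0), "gene_B": (50.0, 49.0)}

for gene, (heat, ctrl) in expression.items():
    print(gene, round(fold_change(heat, ctrl), 2), round(log2_fold_change(heat, ctrl), 2))
```

With these invented values, gene_A would fall inside the 11- to 14-fold window on a linear scale, while gene_B would be essentially unchanged.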
Citations: 0
New Insights Into The Evolution of Chloroplast Genomes in Ochna Species (Ochnaceae, Malpighiales)
Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2023-01-01 DOI: 10.1177/11769343231210756
Nguyen Nhat Nam, Nguyen Hoang Danh, Vu Minh Thiet, Hoang Dang Khoa Do
Ochnaceae DC. includes more than 600 species that have potential value for environmental ecology and for the ornamental, pharmaceutical, and timber industries. Although studies on phylogeny and phytochemicals have been intensively conducted, chloroplast genome data of Ochnaceae species have not been fully explored. In this study, next-generation sequencing was used to sequence the chloroplast genomes of Ochna integerrima and Ochna serrulata, which were 157 329 and 157 835 bp in length, respectively. These chloroplast genomes had a quadripartite structure and contained 78 protein-coding genes, 30 tRNAs, and 4 rRNAs. Comparative analysis revealed 8 hypervariable regions, including trnK_UUU-trnQ_UUG, rpoB-psbM, trnS_GGA-rps4, accD-psaI, rpl33-rps18, rpl14-rpl16, ndhF-trnL_UAG, and rps15-ycf1, among 6 Ochnaceae taxa. Additionally, there were shared and unique repeats among the 6 examined chloroplast genomes. The notable changes were the loss of rpl32 in Ochna species and the deletion of rps16 exon 2 in O. integerrima compared to other taxa. This study is the first comprehensive comparative genomic analysis of complete chloroplast genomes of Ochna species and related taxa in Ochnaceae. Consequently, the current study provides initial results for further research on genomic evolution, population genetics, and developing molecular markers in Ochnaceae and related taxa.
Citations: 0
Effects of Antibiotic Treatment on the Development and Bacterial Community of the Wolbachia-Infected Diamondback Moth.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2023-01-01 DOI: 10.1177/11769343231175269
Xiangyu Zhu, Ling Zhang, Jinyang Li, Ao He, Minsheng You, Shijun You

Given the important role of antibiotic treatment in research on the interaction between Wolbachia and insect hosts, this study aimed to identify the most suitable antibiotic and concentration for eliminating Wolbachia from P. xylostella, and to investigate the effect of Wolbachia and antibiotic treatment on the bacterial community of P. xylostella. Our results showed that the Wolbachia strain infecting the P. xylostella population collected in Nepal was plutWB1 of supergroup B. Feeding with 1 mg/mL rifampicin could remove the Wolbachia infection after 1 generation, and its toxic effect was relatively low. Among the 29 adult P. xylostella samples in our study (10 WU samples, 10 WA samples, and 9 WI samples), 52.5% of the sequences belonged to Firmicutes and 47.5% to Proteobacteria, and the dominant genera were mainly Carnobacterium (46.2%), Enterobacter (10.1%), and Enterococcus (6.2%). Moreover, once Wolbachia had been removed with antibiotics and the insects had been transferred to normal conditions for 10 generations, the treatment no longer significantly affected the bacterial community of P. xylostella. This study provides a theoretical basis for eliminating Wolbachia from P. xylostella, a reference for eliminating Wolbachia from other Wolbachia-infected insect species, and a basis for studying the extent and duration of the effect of antibiotic treatment on the bacterial community of P. xylostella.
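The phylum and genus percentages above are relative abundances. As a rough sketch of how such figures are typically derived from sequence counts, the function below converts per-taxon counts into percentages. The counts are hypothetical and were chosen only so that the output matches the phylum-level percentages in the abstract; the real read counts are not given in the source.

```python
def relative_abundance(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-taxon sequence counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences to summarize")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical sequence counts per phylum (not from the paper).
phylum_counts = {"Firmicutes": 525, "Proteobacteria": 475}

for taxon, pct in relative_abundance(phylum_counts).items():
    print(f"{taxon}: {pct:.1f}%")
```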

Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/ce/d6/10.1177_11769343231175269.PMC10265341.pdf
Citations: 0
The Prognostic Value of a lncRNA Risk Model Consists of 9 m6A Regulator-Related lncRNAs in Hepatocellular Carcinoma (HCC).
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2023-01-01 DOI: 10.1177/11769343221142013
Zhen Deng, Jiaxing Hou, Hongbo Xu, Zhao Lei, Zhiqiang Li, Hongwei Zhu, Xiao Yu, Zhi Yang, Xiaoxin Jin, Jichun Sun

Hepatocellular carcinoma (HCC) is the most common primary malignancy of the liver. Although the RNA modification N6-methyladenosine (m6A) has been reported to be involved in HCC carcinogenesis, early diagnostic markers and promising personalized therapeutic targets are still lacking. In this study, we identified 19 m6A regulators and 34 co-expressed lncRNAs that were significantly upregulated in HCC samples; based on these factors, we established a prognostic signature of HCC associated with 9 lncRNAs and 19 m6A regulators using LASSO Cox regression analysis. Kaplan-Meier survival estimates revealed correlations between the risk scores and patients' overall survival (OS) in the training and validation datasets. The ROC curve demonstrated that the risk score has satisfactory prediction efficiency for both training and validation datasets. Multivariate Cox proportional hazards regression analysis indicated that the risk score was an independent risk factor within the training and validation datasets. In addition, the risk score could distinguish HCC patients from normal non-cancerous samples and HCC samples of different pathological grades. Finally, 232 mRNAs were co-expressed with these 9 lncRNAs according to GSE101685 and GSE112790; these mRNAs were enriched in cell cycle and cell metabolic activities, drug metabolism, liver disease-related pathways, and some important cancer-related pathways such as p53, MAPK, Wnt, and RAS. The expression of the 9 lncRNAs was significantly higher in HCC samples than in the neighboring non-cancerous samples. Altogether, by using Consensus Clustering, PCA, the ESTIMATE algorithm, a LASSO regression model, Kaplan-Meier survival assessment, ROC curve analysis, and multivariate Cox proportional hazards regression analysis, we established a prognostic marker consisting of 9 m6A regulator-related lncRNAs that may have prognostic and diagnostic potential for HCC.
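Risk scores from a LASSO Cox model are conventionally computed as a weighted sum of the selected genes' expression values, with patients then split into high- and low-risk groups at a cutoff. The abstract gives neither the coefficients nor the cutoff, so the sketch below assumes a median split and uses hypothetical lncRNA names, coefficients, and expression values throughout.

```python
from statistics import median


def risk_score(expression: dict[str, float], coefficients: dict[str, float]) -> float:
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores: dict[str, float]) -> dict[str, str]:
    """Label each patient 'high' or 'low' risk relative to the cohort median.

    The median cutoff is an assumption; the abstract does not state one.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical regression coefficients for a two-gene signature.
coefficients = {"lncRNA_1": 0.5, "lncRNA_2": -0.25}

# Hypothetical expression values per patient.
patients = {
    "P1": {"lncRNA_1": 2.0, "lncRNA_2": 4.0},
    "P2": {"lncRNA_1": 4.0, "lncRNA_2": 0.0},
    "P3": {"lncRNA_1": 2.0, "lncRNA_2": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

The two groups produced this way are what a Kaplan-Meier comparison or a Cox model would then take as input.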

Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/1a/64/10.1177_11769343221142013.PMC9841875.pdf
Citations: 1
Characterization and Expression Analysis of B12D-Like Gene From Pearl Millet.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2022-01-01 DOI: 10.1177/11769343221142285
Zainab M Almutairi

B12D-Like is a member of the B12D domain-containing protein family, which includes several transmembrane proteins in plants. In this study, the cDNA of PgB12D-Like from Pennisetum glaucum subsp. monodii (Maire) Brunken was sequenced and characterized. The 446-bp cDNA for PgB12D-Like encodes a deduced protein of 95 amino acids. The PgB12D-Like protein contains a B12D domain and a transmembrane helix embedded in the mitochondrial membrane. Cis-regulatory element analysis reveals binding sites for various transcription factors involved in responses to stress, light, and plant hormones in the putative promoter sequence for PgB12D-Like. Several proteins involved in floral organ development, such as agamous-like proteins and squamosa promoter binding proteins, were also found to have binding sites in the PgB12D-Like promoter. Real-time PCR reveals high expression of PgB12D-Like in flowers during heading, whereas its expression in 4-day-old seedling shoots was the lowest. Moreover, cold, drought, and heat stress were found to upregulate PgB12D-Like, whereas gibberellic acid downregulated its expression in seedlings. The present study helps to uncover the function of B12D-Like in response to plant hormones and abiotic stress during P. glaucum development.
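The relationship between the 446-bp cDNA and the 95-residue protein can be checked with simple codon arithmetic. The sketch below assumes a single contiguous open reading frame that ends in one stop codon; the abstract does not report how the remaining bases divide between the 5' and 3' untranslated regions, so only their combined length is computed.

```python
def cds_length_bp(protein_length_aa: int, include_stop: bool = True) -> int:
    """Length in base pairs of a coding sequence for a protein of the given size.

    Each amino acid is encoded by one 3-base codon; the stop codon adds
    3 more bases when include_stop is True.
    """
    codons = protein_length_aa + (1 if include_stop else 0)
    return 3 * codons


cdna_bp = 446      # cDNA length reported in the abstract
protein_aa = 95    # deduced protein length reported in the abstract

cds_bp = cds_length_bp(protein_aa)   # coding sequence including the stop codon
noncoding_bp = cdna_bp - cds_bp      # combined 5' and 3' untranslated sequence

print(cds_bp, noncoding_bp)
```

Under these assumptions the coding sequence accounts for 288 bp of the cDNA, leaving 158 bp of untranslated sequence.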

Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/f9/c6/10.1177_11769343221142285.PMC9793006.pdf
Citations: 1
Biological Computation and Compatibility Search in the Possibility Space as the Mechanism of Complexity Increase During Progressive Evolution
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2022-01-01 DOI: 10.1177/11769343221110654
A. Kozlov
The idea of computational processes, which take place in nature, for example, DNA computation, is discussed in the literature. DNA computation that is going on in the immunoglobulin locus of vertebrates shows how the computations in the biological possibility space could operate during evolution. We suggest that the origin of evolutionarily novel genes and genome evolution constitute the original intrinsic computation of the information about new structures in the space of unrealized biological possibilities. Due to DNA computation, the information about future structures is generated and stored in DNA as genetic information. In evolving ontogenies, search algorithms are necessary, which can search for information about evolutionary innovations and morphological novelties. We believe that such algorithms include stochastic gene expression, gene competition, and compatibility search at different levels of structural organization. We formulate the increase in complexity principle in terms of biological computation and hypothesize the possibility of in silico computing of future functions of evolutionarily novel genes.
Citations: 2
Diversity in Expression Biases of Lineage-Specific Genes During Development and Anhydrobiosis Among Tardigrade Species.
IF 2.6 Zone 4 (Biology) Q4 EVOLUTIONARY BIOLOGY Pub Date: 2022-01-01 DOI: 10.1177/11769343221140277
Jean-Christophe Metivier, Frédéric J J Chain

Lineage-specific genes can contribute to the emergence and evolution of novel traits and adaptations. Tardigrades are animals that have adapted to tolerate extreme conditions by undergoing a form of cryptobiosis called anhydrobiosis, a physical transformation to an inactive desiccated state. While studies to understand the genetics underlying the interspecies diversity in anhydrobiotic transitions have identified tardigrade-specific genes and family expansions involved in this process, the contributions of species-specific genes to the variation in tardigrade development and cryptobiosis are less clear. We used previously published transcriptomes throughout development and anhydrobiosis (5 embryonic stages, 7 juvenile stages, active adults, and tun adults) to assess the transcriptional biases of different classes of genes between 2 tardigrade species, Hypsibius exemplaris and Ramazzottius varieornatus. We also used the transcriptomes of 2 other tardigrades, Echiniscoides sigismundi and Richtersius coronifer, and data from 3 non-tardigrade species (Adineta vaga, Drosophila melanogaster, and Caenorhabditis elegans) to help identify lineage-specific genes. We found that lineage-specific genes have generally low and narrow expression but are enriched among biased genes in different stages of development depending on the species. Biased genes tend to be specific to early and late development, but there is little overlap in functional enrichment of biased genes between species. Gene expansions in the 2 tardigrades also involve families with different functions despite homologous genes being expressed during anhydrobiosis in both species. Our results demonstrate the interspecific variation in transcriptional contributions and biases of lineage-specific genes during development and anhydrobiosis in 2 tardigrades.
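One common way to quantify how "narrow" a gene's expression is across stages is the tau specificity index, which ranges from 0 (equally expressed in every stage) to 1 (expressed in a single stage). The abstract does not say which measure the authors used, so treating tau as the metric is an assumption, and the expression profiles below are hypothetical.

```python
def tau(expression: list[float]) -> float:
    """Tau specificity index across stages or tissues.

    tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression in
    stage i, x_max is the highest expression, and n is the number of stages.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least two stages")
    peak = max(expression)
    if peak <= 0:
        raise ValueError("gene is not expressed in any stage")
    return sum(1 - x / peak for x in expression) / (n - 1)


# Hypothetical expression profiles across four developmental stages.
broad_gene = [5.0, 5.0, 5.0, 5.0]     # uniform expression
narrow_gene = [0.0, 0.0, 0.0, 8.0]    # expressed in one stage only

print(tau(broad_gene), tau(narrow_gene))
```

A gene with low overall expression and a tau close to 1 would match the "low and narrow" pattern the abstract describes for lineage-specific genes.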

{"title":"Diversity in Expression Biases of Lineage-Specific Genes During Development and Anhydrobiosis Among Tardigrade Species.","authors":"Jean-Christophe Metivier,&nbsp;Frédéric J J Chain","doi":"10.1177/11769343221140277","DOIUrl":"https://doi.org/10.1177/11769343221140277","url":null,"abstract":"<p><p>Lineage-specific genes can contribute to the emergence and evolution of novel traits and adaptations. Tardigrades are animals that have adapted to tolerate extreme conditions by undergoing a form of cryptobiosis called anhydrobiosis, a physical transformation to an inactive desiccated state. While studies to understand the genetics underlying the interspecies diversity in anhydrobiotic transitions have identified tardigrade-specific genes and family expansions involved in this process, the contributions of species-specific genes to the variation in tardigrade development and cryptobiosis are less clear. We used previously published transcriptomes throughout development and anhydrobiosis (5 embryonic stages, 7 juvenile stages, active adults, and tun adults) to assess the transcriptional biases of different classes of genes between 2 tardigrade species, <i>Hypsibius exemplaris</i> and <i>Ramazzottius varieornatus</i>. We also used the transcriptomes of 2 other tardigrades, <i>Echiniscoides sigismundi</i> and <i>Richtersius coronifer</i>, and data from 3 non-tardigrade species (<i>Adenita vaga</i>, <i>Drosophila melanogaster</i>, and <i>Caenorhabditis elegans</i>) to help identify lineage-specific genes. We found that lineage-specific genes have generally low and narrow expression but are enriched among biased genes in different stages of development depending on the species. Biased genes tend to be specific to early and late development, but there is little overlap in functional enrichment of biased genes between species. 
Gene expansions in the 2 tardigrades also involve families with different functions despite homologous genes being expressed during anhydrobiosis in both species. Our results demonstrate the interspecific variation in transcriptional contributions and biases of lineage-specific genes during development and anhydrobiosis in 2 tardigrades.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"18 ","pages":"11769343221140277"},"PeriodicalIF":2.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/51/e1/10.1177_11769343221140277.PMC9791283.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10509846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Stability of scRNA-Seq Analysis Workflows is Susceptible to Preprocessing and is Mitigated by Regularized or Supervised Approaches.
IF 2.6 Zone 4, Biology Q4 EVOLUTIONARY BIOLOGY Pub Date: 2022-01-01 DOI: 10.1177/11769343221123050
Arda Durmaz, Jacob G Scott

Background: Statistical methods developed to address various questions in single-cell datasets show increased variability across different parameter regimes. To further delineate the robustness of commonly used methods for single-cell RNA-Seq, we aimed to comprehensively review scRNA-Seq analysis workflows in the setting of dimension reduction, clustering, and trajectory inference.

Methods: We used datasets with temporal single-cell transcriptomics profiles from public repositories. Combining multiple methods at each level of the workflow, we performed over 6k analyses and evaluated the results of clustering and pseudotime estimation using the adjusted Rand index and rank correlation metrics. We further integrated neural network methods to assess whether models with increased complexity can show an increased bias/variance trade-off.

Results: Combinatorial workflows showed that non-linear dimension reduction techniques such as t-SNE and UMAP are sensitive to initial preprocessing steps; hence, clustering results on the dimension-reduced space of single-cell datasets should be used carefully. Similarly, pseudotime estimation methods that depend on previous non-linear dimension reduction steps can result in highly variable trajectories. In contrast, methods that avoid non-linearity, such as WOT, can result in repeatable inferences of temporal gene expression dynamics. Furthermore, imputation methods do not substantially improve clustering or trajectory inference results in terms of repeatability. In contrast, the choice of normalization method has a larger effect on downstream analysis, with ScTransform reducing variability overall.
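
The two evaluation metrics named in the Methods can be illustrated with a small, dependency-free Python sketch. It is not the authors' pipeline: the function names, the tie-handling choice (average ranks), and the toy inputs are assumptions made here for illustration. The adjusted Rand index compares two cluster assignments of the same cells, and the Spearman rank correlation compares two pseudotime orderings.

```python
from collections import Counter
from math import comb, sqrt


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two cells")
    # Pairs of cells that share a cluster in both, in A only, in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = in_a * in_b / comb(n, 2)   # agreement expected by chance
    maximum = (in_a + in_b) / 2
    if maximum == expected:               # both partitions are trivial
        return 1.0
    return (both - expected) / (maximum - expected)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy example: clusters from two hypothetical preprocessing variants.
run_1 = [0, 0, 1, 1]
run_2 = [1, 1, 0, 0]   # same partition, different label names
print(adjusted_rand_index(run_1, run_2))
print(spearman([0.1, 0.4, 0.9], [2.0, 5.0, 7.0]))
```

Because the adjusted Rand index ignores label names, it suits comparisons between workflow runs, where cluster identifiers are arbitrary. A rank-based correlation is similarly insensitive to the scale of the pseudotime values.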

Citations: 1
Evolutionary Dynamics of Indels in SARS-CoV-2 Spike Glycoprotein.
IF 2.6 Zone 4, Biology Q4 EVOLUTIONARY BIOLOGY Pub Date: 2021-12-06 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211064616
R Shyama Prasad Rao, Nagib Ahsan, Chunhui Xu, Lingtao Su, Jacob Verburgt, Luca Fornelli, Daisuke Kihara, Dong Xu

SARS-CoV-2, responsible for the current COVID-19 pandemic that claimed over 5.0 million lives, belongs to a class of enveloped viruses that undergo quick evolutionary adjustments under selection pressure. Numerous variants have emerged in SARS-CoV-2, posing a serious challenge to the global vaccination effort and COVID-19 management. The evolutionary dynamics of this virus are only beginning to be explored. In this work, we have analysed 1.79 million spike glycoprotein sequences of SARS-CoV-2 and found that the virus is fine-tuning the spike with numerous amino acid insertions and deletions (indels). Indels seem to have a selective advantage as the proportions of sequences with indels steadily increased over time, currently at over 89%, with similar trends across countries/variants. There were as many as 420 unique indel positions and 447 unique combinations of indels. Despite their high frequency, indels resulted in only minimal alteration of N-glycosylation sites, including both gain and loss. As indels and point mutations are positively correlated and sequences with indels have significantly more point mutations, they have implications in the evolutionary dynamics of the SARS-CoV-2 spike glycoprotein.
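
As a rough illustration of how indels might be tallied from aligned sequences, the Python sketch below scans a pairwise alignment of a query against a reference and then computes the share of sequences carrying at least one indel. The abstract does not describe the authors' procedure. The function names, the gap character `-`, the position convention, and the toy strings are all assumptions made here.

```python
def find_indels(aligned_ref, aligned_query):
    """List (kind, ref_position, length) for each indel in a pairwise alignment.

    Positions are 1-based reference coordinates. A deletion is reported at the
    first deleted reference residue; an insertion is reported at the reference
    residue immediately before the inserted run (0 if it precedes the start).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    indels = []
    current = None          # the indel run currently being extended
    ref_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r == "-" and q == "-":
            continue        # gap-only column carries no information
        if r != "-":
            ref_pos += 1
        if r == "-":
            kind = "insertion"
        elif q == "-":
            kind = "deletion"
        else:
            kind = None
        if kind is None:
            current = None
        elif current is not None and current[0] == kind:
            current[2] += 1  # extend the ongoing run
        else:
            current = [kind, ref_pos, 1]
            indels.append(current)
    return [tuple(item) for item in indels]


def fraction_with_indels(alignments):
    """Share of (aligned_ref, aligned_query) pairs with at least one indel."""
    if not alignments:
        raise ValueError("no alignments given")
    hits = sum(1 for ref, query in alignments if find_indels(ref, query))
    return hits / len(alignments)


# Toy strings for illustration only, not real sequence data.
ref = "MFVFLVLL-PLV"
qry = "MFV--VLLAPLV"
print(find_indels(ref, qry))
print(fraction_with_indels([(ref, qry), ("MFVF", "MFVF")]))
```

Collecting the `(kind, ref_position, length)` tuples across many sequences would give the kind of tallies the abstract reports, such as the number of unique indel positions and the number of unique indel combinations per sequence.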

Citations: 0
The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction.
IF 1.7 Zone 4, Biology Q4 EVOLUTIONARY BIOLOGY Pub Date: 2021-12-03 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211062608
Irene van den Bent, Stavros Makrodimitris, Marcel Reinders

Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected its molecular function prediction performance to detailed characterization, first to advance the understanding of protein language models, and second to determine areas of improvement. Then, we applied the model in a transfer learning task by training a function predictor based on the embeddings of annotated protein sequences of one training species and making predictions on the proteins of several test species with varying evolutionary distance. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.
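
The transfer setup described above (fit a predictor on embeddings from one annotated species, then apply it to proteins of another species) can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the paper's model: it uses a nearest-centroid classifier with cosine similarity, assigns a single label per protein (real function annotation is typically multi-label), and the embeddings, labels, and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def fit_centroids(embeddings, labels):
    """Average the training embeddings of each function label."""
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, embedding):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine(centroids[label], embedding))


# Hypothetical 2-dimensional embeddings; real protein embeddings come from a
# pre-trained language model and have far more dimensions.
train_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = ["kinase", "kinase", "transporter", "transporter"]
centroids = fit_centroids(train_vectors, train_labels)

# Proteins from a different (test) species, embedded by the same model.
print(predict(centroids, [0.8, 0.2]))
print(predict(centroids, [0.2, 0.7]))
```

The point of the sketch is only the data flow: the embedding model is shared across species, so a predictor fitted on one species can be applied unchanged to another.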

Citations: 0