E. E. Zelenova, A. A. Karlsen, D. V. Avdoshina, K. K. Kyuregyan, M. G. Belikova, I. D. Trotsenko
Molecular Biology (journal article), published 2024-08-07. DOI: 10.1134/s0026893324700213 (https://doi.org/10.1134/s0026893324700213)
Amino Acid Substitution Patterns in the E6 and E7 Proteins of HPV Type 16: Phylogeography and Evolution
Abstract
The E6 and E7 proteins of the high-risk human papillomaviruses (HR HPVs) play a key role in the oncogenesis associated with papillomavirus infection. Data on the variability of these proteins are limited, and the factors affecting their variability are still poorly understood. We analyzed the variability of the currently known sequences of the HPV type 16 (HPV16) E6 and E7 proteins, taking into account their geographic origin and year of sample collection, as well as the direction of their evolution in the major geographic regions of the world. All sequences belonging to HPV16 genome fragments encoding the E6 and E7 oncoproteins were downloaded from the NCBI GenBank database on October 6, 2022. Samples were filtered according to the following criteria: the sequence had to include at least one of the two complete open reading frames and have a stated collection date and country of origin. A total of 3651 full-genome nucleotide sequences encoding the E6 protein and 4578 full-genome nucleotide sequences encoding the E7 protein were sampled. The nucleotide sequences obtained after sampling and alignment were converted to amino acid sequences and analyzed using the MEGA11, R, RStudio, Jmodeltest 2.1.20, BEAST v1.10.4, Fastcov, and Biostrings software. The highest variability in the E6 protein was recorded for amino acid (AA) residues at positions 17, 21, 32, 85, and 90. The most variable AA positions in E7 were 28, 29, 51, and 77. The samples were divided geographically into five heterogeneous groups: Africa, Europe, America, South-West and South Asia, and South-East Asia. Unique amino acid substitutions (AA-substitutions) in the E6/E7 proteins of HPV16, presumably characteristic of certain ethnic groups, were identified for a number of countries. They were mainly localized in the sites of known B- and T-cell epitopes and relatively rarely in the domains critical for protein structure and function. The revealed differences in AA-substitutions among ethnic groups and their colocalization with the clusters of B- and T-cell epitopes suggested a possible relation to the geographical distribution of alleles and haplotypes of the major histocompatibility complex (HLA). This may lead to the recognition of a different set of B- and T-cell epitopes of the virus in different geographic areas, resulting in regional differences in the direction of epitopic drift. Phylogenetic analysis of the nucleotide sequences encoding the E6 protein of HPV16 revealed a common ancestor, confirmed regional clustering of the E6 protein sequences sharing common AA-substitutions, and identified cases of reversion of individual AA-substitutions upon a change of geographical localization. For the E7 protein, such analysis was not possible due to the high sequence homology. Covariance analysis of the pooled E6 and E7 sequences revealed no associations between amino acid residues at any AA positions within E6 or E7, or between AA positions of the E6 and E7 proteins. The data presented here are important for the development of universal therapeutic vaccines against HPV of high carcinogenic risk.
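The abstract reports the most variable AA positions but does not say which variability measure was used. As a rough illustration only, the sketch below ranks alignment columns by Shannon entropy, which is an assumed stand-in for the authors' metric. The function names and the toy alignment are hypothetical and are not HPV16 data.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def rank_variable_positions(aligned_seqs, top=3):
    """Return (1-based position, entropy) pairs, most variable first.

    Invariant columns (entropy 0) are dropped before ranking.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    scores = [
        (pos, column_entropy(col))
        for pos, col in enumerate(zip(*aligned_seqs), start=1)
    ]
    variable = [(pos, h) for pos, h in scores if h > 0]
    # Sort by descending entropy, breaking ties by position.
    return sorted(variable, key=lambda item: (-item[1], item[0]))[:top]


# Hypothetical toy alignment of amino acid sequences (assumption, not real data).
toy_alignment = [
    "MHQKR",
    "MHQKR",
    "MDQKR",
    "MDQER",
]
ranking = rank_variable_positions(toy_alignment)
for position, entropy in ranking:
    print(f"position {position}: {entropy:.3f} bits")
```

In this toy input only columns 2 and 4 vary, so they are the only positions returned. On real data, the same ranking could be computed separately for each geographic group to compare regional substitution patterns.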
About the journal:
Molecular Biology is an international peer-reviewed journal that covers a wide range of problems in molecular, cell, and computational biology, including genomics, proteomics, bioinformatics, molecular virology and immunology, molecular developmental biology, molecular evolution, and related areas. Molecular Biology publishes reviews and experimental and theoretical works. Every year, the journal publishes special issues devoted to the most rapidly developing branches of physical-chemical biology and to the most outstanding scientists.