E. E. Zelenova, A. A. Karlsen, D. V. Avdoshina, K. K. Kyuregyan, M. G. Belikova, I. D. Trotsenko
Amino Acid Substitution Patterns in the E6 and E7 Proteins of HPV Type 16: Phylogeography and Evolution
Molecular Biology, published 2024-08-07. DOI: 10.1134/s0026893324700213
Citation count: 0
Abstract
The E6 and E7 proteins of the high-risk human papillomaviruses (HR HPVs) play a key role in the oncogenesis associated with papillomavirus infection. Data on the variability of these proteins are limited, and the factors affecting their variability are still poorly understood. We analyzed the variability of the currently known sequences of the HPV type 16 (HPV16) E6 and E7 proteins, taking into account their geographic origin and year of sample collection, as well as the direction of their evolution in the major geographic regions of the world. All sequences belonging to HPV16 genome fragments encoding the E6 and E7 oncoproteins were downloaded from the NCBI GenBank database on October 6, 2022. Samples were filtered according to the following parameters: the sequence had to include at least one of the two complete open reading frames and to have a stated date of collection and country of origin. A total of 3651 full-genome nucleotide sequences encoding the E6 protein and 4578 full-genome nucleotide sequences encoding the E7 protein were sampled. The nucleotide sequences obtained after sampling and alignment were converted to amino acid sequences and analyzed using the MEGA11, R, RStudio, Jmodeltest 2.1.20, BEAST v1.10.4, Fastcov, and Biostrings software. The highest variability in the E6 protein was recorded for amino acid (aa) residues at positions 17, 21, 32, 85, and 90. The most variable aa positions in E7 were 28, 29, 51, and 77. The samples were divided geographically into five heterogeneous groups: Africa, Europe, America, South-West and South Asia, and South-East Asia. Unique amino acid substitutions (AA-substitutions) in the E6/E7 proteins of HPV16, presumably characteristic of certain ethnic groups, were identified for a number of countries. They were mainly localized in the sites of known B- and T-cell epitopes and relatively rarely in the domains critical for protein structure and function. The revealed differences in AA-substitutions between ethnic groups and their colocalization with clusters of B- and T-cell epitopes suggest a possible relation to the geographical distribution of alleles and haplotypes of the major histocompatibility complex (HLA). This may lead to the recognition of a different set of B- and T-cell epitopes of the virus in different geographic areas, resulting in regional differences in the direction of epitopic drift. Phylogenetic analysis of the nucleotide sequences encoding the E6 protein of HPV16 revealed a common ancestor, confirmed regional clustering of the E6 protein sequences sharing common AA-substitutions, and identified cases of reversion of individual AA-substitutions upon a change of geographical location. For the E7 protein, such analysis was not possible due to the high sequence homology. Covariance analysis of the pooled E6 and E7 sequences revealed no associations between amino acid residues at any aa positions within E6 or E7, or between aa positions of the E6 and E7 proteins. The data presented here are important for the development of universal therapeutic vaccines against HPV of high carcinogenic risk.
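As a rough illustration of what per-position variability in aligned protein sequences can mean, the sketch below computes the Shannon entropy of each alignment column. The toy sequences, the function names `column_entropy` and `positional_variability`, and the choice of Shannon entropy as the measure are all assumptions made for illustration; the abstract does not state which variability measure was used, and this is not the authors' pipeline.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column; gaps ('-') are ignored."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def positional_variability(aligned_seqs):
    """Return one entropy value per alignment column (position = index + 1)."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    # zip(*seqs) transposes the alignment so each tuple is one column.
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment, NOT real HPV16 E6/E7 data.
toy_alignment = ["MHQK", "MHQK", "MDQK", "MDQR"]
entropies = positional_variability(toy_alignment)

# 1-based index of the most variable column in the toy alignment.
most_variable = max(range(len(entropies)), key=entropies.__getitem__) + 1
```

A column where every sequence carries the same residue scores 0 bits, and the score rises as residues become more evenly split, so ranking columns by this value gives one simple way to flag the most variable positions.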
About the journal:
Molecular Biology is an international peer-reviewed journal that covers a wide scope of problems in molecular, cell, and computational biology, including genomics, proteomics, bioinformatics, molecular virology and immunology, molecular developmental biology, molecular evolution, and related areas. Molecular Biology publishes reviews and experimental and theoretical works. Every year, the journal publishes special issues devoted to the most rapidly developing branches of physical-chemical biology and to the most outstanding scientists.