Nicholas C Palmateer, James B Munro, Sushma Nagaraj, Jonathan Crabtree, Roger Pelle, Luke Tallon, Vish Nene, Richard Bishop, Joana C Silva
Journal of Molecular Evolution, pages 897-911. Published 2023-12-01 (Epub 2023-11-28). DOI: 10.1007/s00239-023-10142-z (https://doi.org/10.1007/s00239-023-10142-z). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10730637/pdf/
Citations: 0
The Hypervariable Tpr Multigene Family of Theileria Parasites, Defined by a Conserved, Membrane-Associated, C-Terminal Domain, Includes Several Copies with Defined Orthology Between Species.
Abstract
Multigene families often play an important role in host-parasite interactions. One of the largest multigene families in Theileria parva, the causative agent of East Coast fever, is the T. parva repeat (Tpr) gene family. The function of the putative Tpr proteins remains unknown. The initial publication of the T. parva reference genome identified 39 Tpr family open reading frames (ORFs) sharing a conserved C-terminal domain. Twenty-eight of these are clustered in a central region of chromosome 3, termed the "Tpr locus", while others are dispersed throughout all four nuclear chromosomes. The Tpr locus contains three of the four assembly gaps remaining in the genome, suggesting the presence of additional, as yet uncharacterized, Tpr gene copies. Here, we describe the use of long-read sequencing to attempt to close the gaps in the reference assembly of T. parva (located among multigene family clusters), characterize the full complement of Tpr family ORFs in the T. parva reference genome, and evaluate their evolutionary relationship with Tpr homologs in other Theileria species. We identify three new Tpr family genes in the T. parva reference genome and show that sequence similarity among paralogs in the Tpr locus is significantly higher than among genes outside the Tpr locus. We also identify sequences homologous to the conserved C-terminal domain in five additional Theileria species. Using these sequences, we show that the evolution of this gene family involves conservation of a few orthologs across species, combined with gene gains/losses and species-specific expansions.
About the journal:
Journal of Molecular Evolution covers experimental, computational, and theoretical work aimed at deciphering features of molecular evolution and the processes bearing on these features, from the initial formation of macromolecular systems through their evolution at the molecular level, the co-evolution of their functions in cellular and organismal systems, and their influence on organismal adaptation, speciation, and ecology. Topics addressed include the evolution of informational macromolecules and their relation to more complex levels of biological organization, including populations and taxa, as well as the molecular basis for the evolution of ecological interactions of species and the use of molecular data to infer fundamental processes in evolutionary ecology. This coverage accommodates such subfields as new genome sequences, comparative structural and functional genomics, population genetics, the molecular evolution of development, the evolution of gene regulation and gene interaction networks, and in vitro evolution of DNA and RNA, molecular evolutionary ecology, and the development of methods and theory that enable molecular evolutionary inference, including but not limited to, phylogenetic methods.