Pub Date : 2024-09-13 DOI: 10.1101/2024.09.10.608920
Rob J Dekker, Wim C de Leeuw, Marina van Olst, Wim A Ensink, Selina van Leeuwen, Timo M Breit, Martijs J Jonker
To contribute to the discovery of the global virome, we explored plants from an urban botanic garden in the Netherlands for new and variant plant viruses. Analyzing RNA from 25 plants of the Asparagaceae family with both RNA-seq and smallRNA-seq revealed, in six plants, the presence of a variant Polerovirus from the Solemoviridae family that shows an overall RNA-sequence identity of 93% to the known Ornithogalum Virus 5 (OV-5). Amino acid sequence comparison of the complex set of proteins produced by the new virus revealed that all but Protein-0 (P0) showed high similarity (> 91%) with those of the OV-5 virus. The similarity between the new P0 protein and the OV-5 P0 protein, however, was less than 83% for all variants, well below the ICTV protein-based species demarcation criterion. Hence, we named the new virus Asparagaceae Polerovirus 1 (AspPolV-1).
Title : A new Polerovirus species in plants from the Asparagaceae family defined by its RNA-silencing repressor protein P0
Journal : bioRxiv - Genomics
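The protein-level comparison described in this abstract can be illustrated with a minimal sketch, assuming two sequences that have already been aligned (gaps written as `-`). The function names, the toy sequences, and the 90% default threshold are assumptions made for illustration only; the abstract itself states just that a P0 similarity below 83% falls under the ICTV criterion, and it does not say which alignment tool or identity definition the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def below_demarcation(identity_pct: float, threshold_pct: float = 90.0) -> bool:
    """True if identity falls below the threshold (default value is an assumption)."""
    return identity_pct < threshold_pct


# Toy aligned fragments (hypothetical, not taken from the study).
toy_identity = percent_identity("MKT-LLV", "MKTALIV")
print(round(toy_identity, 1), below_demarcation(toy_identity))
```

Other identity definitions (for example, counting gapped columns in the denominator) give different percentages, so the definition should be fixed before values are compared with a demarcation threshold.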
Pub Date : 2024-09-12 DOI: 10.1101/2024.09.06.611754
Stefano Secchia, Vera Beilinson, Xiaoting Chen, Zi F. Yang, Joseph A. Wayman, Jasbir Dhaliwal, Ingrid Jurickova, Elizabeth Angerman, Lee A. Denson, Emily R. Miraldi, Matthew T. Weirauch, Kohta Ikegami
Nutrient deprivation induces a reversible cell cycle arrest state termed quiescence, which often accompanies transcriptional silencing and chromatin compaction. Paradoxically, nutrient deprivation is associated with activated fibroblast states in pathological microenvironments in which fibroblasts drive extracellular matrix (ECM) remodeling to alter tissue environments. The relationship between nutrient deprivation and fibroblast activation remains unclear. Here, we report that serum deprivation extensively activates transcription of ECM remodeling genes in cultured fibroblasts, despite the induction of quiescence. Starvation-induced transcriptional activation accompanied large-scale histone acetylation of putative distal enhancers, but not promoters. The starvation-activated putative enhancers were enriched for non-coding genetic risk variants associated with inflammatory bowel disease (IBD), suggesting that the starvation-activated gene regulatory network may contribute to fibroblast activation in IBD. Indeed, the starvation-activated gene PLAU, encoding uPA serine protease for plasminogen and ECM, was upregulated in inflammatory fibroblasts in the intestines of IBD patients. Furthermore, the starvation-activated putative enhancer at PLAU, which harbors an IBD risk variant, gained chromatin accessibility in IBD patient fibroblasts. This study implicates nutrient deprivation in transcriptional activation of ECM remodeling genes in fibroblasts and suggests nutrient deprivation as a potential mechanism for pathological fibroblast activation in IBD.
Title : Nutrient starvation activates ECM remodeling gene enhancers associated with inflammatory bowel disease risk in fibroblasts
Journal : bioRxiv - Genomics
Pub Date : 2024-09-12 DOI: 10.1101/2024.09.12.612198
Dubravka Vucicevic, Che-Wei Hsu, Lorena Sofia Lopez Zepeda, Martin Burkert, Antje Hirsekorn, Ilija Bilic, Nicolai Kastelic, Markus Landthaler, Scott A. Lacadie, Uwe Ohler
Transcriptional enhancers are non-coding DNA elements that regulate gene transcription in a temporal and tissue-specific manner. Despite advances in computational and experimental methods, identifying enhancers and their target genes remains challenging. To identify and functionally perturb enhancers at their endogenous sites, we performed a pooled tiling CRISPR activation (CRISPRa) screen surrounding PHOX2B, a regulator of neuronal differentiation and neuroblastoma, revealing many CRISPRa-responsive-elements (CaREs) that alter cellular growth. To determine CaRE target genes, we developed and applied TESLA-seq (TargEted-SingLe-cell-Activation), which combines CRISPRa screening with targeted single-cell RNA-sequencing and enabled the parallel readout of the effect of hundreds of enhancers on all genes in the locus. While most TESLA-revealed CaRE-gene relationships involved neuroblastoma regulatory elements already active in the system, we found many CaREs and target connections normally active only in other tissue types or with no previous evidence. This highlights the power of TESLA-seq to reveal gene regulatory networks, including edges active outside of a given experimental system.
Title : Sensitive dissection of a genomic regulatory landscape using bulk and targeted single-cell activation
Journal : bioRxiv - Genomics
Expanding tandem gene arrays facilitates adaptation through dosage effects and gene family formation via sequence diversification. However, experimental induction of such expansions remains challenging. Here we introduce a method termed break-induced replication (BIR)-mediated tandem repeat expansion (BITREx) to address this challenge. BITREx strategically places Cas9 nickase adjacent to a tandem gene array to break the replication fork that has replicated the array, forming a single-end double-strand break. This break is subsequently end-resected to become single-stranded. Since there is no repeat unit downstream of the break, the single-stranded DNA often invades an upstream unit to initiate ectopic BIR, resulting in array expansion. BITREx has successfully expanded gene arrays in budding yeast, with the CUP1 array reaching ~1 Mb. Furthermore, appropriate splint DNA allows BITREx to generate tandem arrays de novo from single-copy genes. We have also demonstrated BITREx in mammalian cells. Therefore, BITREx will find various unique applications in genome engineering.
Title : Strategic targeting of Cas9 nickase expands tandem gene arrays
Journal : bioRxiv - Genomics
Pub Date : 2024-09-12 DOI: 10.1101/2024.09.10.612242
Hiroaki Takesue, Satoshi Okada, Goro Doi, Yuki Sugiyama, Emiko Kusumoto, Takashi Ito
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.10.612220
Mengzhen Li, Xuanpei Zhai, Jie Li, Shiyan Li, Yanan Du, Jian Zhang, Rong Zhang, Yuan Luo, Wu Wei, Yifan Liu
Microorganisms dominate Earth's ecosystems in both abundance and diversity. Studying these organisms relies on their genome information, but obtaining high-quality genomes has long been challenging due to technical limitations in genomic sequencing. Traditional genome sequencing methods are limited by the need to culture isolated strains, excluding unculturable taxa and restricting the study of complex communities. Metagenomics bypasses this but lacks single-cell resolution and often misses rare, critical species. Droplet-based single-cell genomics offers high-throughput genome library preparation but faces challenges like requiring complex microfluidic setups and low genome coverage. To address these challenges, we introduce CAP-seq, a high-throughput single-cell genomic sequencing method with markedly improved genome coverage and ease of use. CAP-seq employs semi-permeable compartments that allow reagent exchange while retaining large DNA fragments, enabling efficient genome processing. This innovation results in higher-quality single amplified genomes (SAGs) and significantly improves resolution. In validation tests with simple and complex microbial communities, CAP-seq yielded high-quality SAGs with over 50% genome coverage, capturing rare taxa more accurately and providing detailed insights into strain-level variation. CAP-seq thus offers a scalable, high-resolution solution for microbial genomic analysis, overcoming the limitations of droplet-based single-cell genomics and enhancing the study of complex microbial ecosystems.
Title : High-coverage, massively parallel sequencing of single-cell genomes with CAP-seq
Journal : bioRxiv - Genomics
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.10.610634
Theresa Schmid, Gabriele Rodrian, Alexander Kohler, Michael Wegner, Lina Goelz, Matthias Weider
Orofacial clefts are the second-most prevalent congenital malformation. Risk factors are manifold and include genetic as well as environmental components. One environmental factor is hypoxia during pregnancy, caused for instance by tobacco smoking, medication or living at high altitudes. Knowledge about the molecular link between hypoxia and orofacial clefts is largely lacking. We here show that hypoxia has only modest effects on proliferating cranial neural crest cells, but dramatically influences their differentiation potential. We detected massive perturbations in their differentiation to chondrocytes, osteoblasts and smooth muscle cells. The transcriptional induction of the majority of regulated genes during each of these processes was grossly impaired by hypoxic conditions, as evidenced by genome-wide transcriptomic analyses. Bioinformatic analyses pointed to cytoskeletal organization and amino acid metabolism as two main processes compromised during all three differentiation pathways, and several orofacial cleft risk genes were among the genes with impaired induction during hypoxia. Our analyses reveal a drastic influence of hypoxia on the differentiation potential of cranial neural crest cells as a possible cause of orofacial clefts.
Title : Hypoxia impedes differentiation of cranial neural crest cells into derivatives relevant for craniofacial development
Journal : bioRxiv - Genomics
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.05.611428
Moritz A Peters, Volker Soltys, Dingwen Su, Yingguang Frank Chan
T cells recognize an immense spectrum of pathogens to initiate immune responses by means of a large repertoire of T cell receptors (TCRs) that arise from somatic rearrangements of variable, diversity and joining gene segments at the TCR loci. These gene segments have emerged from a limited number of ancestral genes through a series of gene duplication events, resulting in a highly variable number of such genes across different species. Apart from the complete V(D)J gene annotations in the human and mouse reference assemblies, little is known about the structure of TCR loci in other species. Here, we performed a comprehensive comparison of the TCRα and TCRβ gene segment clusters in mice and three closely related sister species. We show that the TCRα variable gene cluster is frequently rearranged, leading to deletions and sequence inversions in this region. The resulting complexity of TCR loci severely complicates the assembly of these loci and the annotation of gene segments. By jointly utilizing genomic and transcriptomic data, we show that in Mus castaneus the variable gene cluster at the α locus has undergone a recent major locus contraction, leading to the loss of 74 variable gene segments. Additionally, we validated the expression of functional variable genes, including atypical ones with inverted orientation relative to other such segments. Disentangling the fine-scale structure of TCR loci in different species can provide valuable insights into the evolution and diversity of TCR repertoires.
Title : Distinct evolution at TCRα and TCRβ loci in the genus Mus
Journal : bioRxiv - Genomics
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.10.612283
Juan I Bravo, Lucia Zhang, Bérénice Anath Benayoun
Long INterspersed Element-1 (LINE-1; L1) and Alu are two families of transposable elements (TEs) occupying ~17% and ~11% of the human genome, respectively. Though only a small fraction of L1 copies is able to produce the machinery to mobilize autonomously, Alu elements and degenerate L1 copies can hijack their functional machinery and mobilize in trans. The expression and subsequent copy number expansion of L1 and Alu can exert pathological effects on their hosts, promoting genome instability, inflammation, and cell cycle alterations. These features have made L1 and Alu promising focus subjects in studies of aging and aging diseases where they can become active. However, the mechanisms regulating variation in their expression and copy number remain incompletely characterized. Moreover, the relevance of known mechanisms to diverse human populations remains unclear, as mechanisms are often characterized in isogenic cell culture models. To address these gaps, we leveraged genomic data from the 1000 Genomes Project to carry out a trans-ethnic GWAS of L1 and Alu insertion global singletons. These singletons are rare insertions observed only once in a population, potentially reflecting recently acquired L1 and Alu integrants or structural variants, and which we used as proxies for L1/Alu-associated copy number variation. Our computational approach identified single nucleotide variants in genomic regions containing genes with potential and known TE regulatory properties, and it enriched for single nucleotide variants in regions containing known regulators of L1 expression. Moreover, we identified many reference TE copies and polymorphic structural variants that were associated with L1/Alu singletons, suggesting their potential contribution to TE copy number variation through transposition-dependent or transposition-independent mechanisms. 
Finally, a transcriptional analysis of lymphoblastoid cells highlighted potential cell cycle alterations in a subset of samples harboring L1/Alu singletons. Collectively, our results (i) suggest that known TE regulatory mechanisms may also play regulatory roles in diverse human populations, (ii) expand the list of genic and repetitive genomic loci implicated in TE copy number variation, and (iii) reinforce the links between TEs and disease.
Title : Multi-ancestry GWAS reveals loci linked to human variation in LINE-1- and Alu-copy numbers
Journal : bioRxiv - Genomics
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.11.612570
José M. Uribe-Salazar, Gulhan Kaya, KaeChandra B. Weyenberg, Brittany Radke, Keiko Hino, Daniela C. Soto, Jia-Lin Shiu, Wenzhu Zhang, Cole Ingamells, Nicholas K. Haghani, Emily Xu, Joseph Rosas, Sergi Simó, Joel Miesfeld, Tom Glaser, Scott C Baraban, Li-En C Jao, Megan Y. Dennis
Recent expansion of duplicated genes unique to the Homo lineage likely contributed to brain evolution and other human-specific traits. One hallmark example is the expansion of the human SRGAP2 family, which produced the human-specific paralog SRGAP2C. Introduction of SRGAP2C in mouse models is associated with altered cortical neuronal migration, axon guidance, synaptogenesis, and sensory-task performance. Truncated, human-specific SRGAP2C heterodimerizes with the full-length ancestral gene product SRGAP2A and antagonizes its functions. However, the significance of SRGAP2 duplication beyond neocortex development has not been elucidated, owing to the embryonic lethality of complete Srgap2 knockout in mice. Using zebrafish, we showed that srgap2 knockout results in viable offspring that phenocopy "humanized" SRGAP2C larvae. Specifically, human SRGAP2C protein interacts with zebrafish Srgap2, demonstrating functional antagonism of Srgap2 similar to that observed in mice. Shared traits between knockout and humanized zebrafish larvae include altered morphometric features (i.e., reduced body length and inter-eye distance) and differential expression of synapse-, axogenesis-, and vision-related genes. Through single-cell transcriptome analysis, we further observed a skewed balance of excitatory and inhibitory neurons that likely contributes to the increased susceptibility to seizures displayed by Srgap2 mutant larvae, a phenotype resembling SRGAP2 loss-of-function in a child with early infantile epileptic encephalopathy. Single-cell data also pointed to strong microglial expression of srgap2, with mutants exhibiting altered membrane dynamics and likely delayed maturation of microglial cells. srgap2-expressing microglial cells were also detected in the developing eye, together with altered expression of genes related to axogenesis and synaptogenesis in mutant retinal cells. Consistent with the perturbed gene expression in the retina, we found that SRGAP2 mutant larvae exhibited increased sensitivity to broad and fine visual cues. Finally, comparing the transcriptomes of relevant cell types between humans (+SRGAP2C) and non-human primates (-SRGAP2C) revealed significant overlaps of gene alterations with mutant cells in our zebrafish models; this suggests that SRGAP2C plays similar roles in altering microglia and the visual system in modern humans. Together, our functional characterization of zebrafish Srgap2 and human SRGAP2C in zebrafish uncovers novel gene functions and highlights the strength of cross-species analysis in understanding the development of human-specific features.
{"title":"Zebrafish models of human-duplicated gene SRGAP2 reveal novel functions in microglia and visual system development","authors":"José M. Uribe-Salazar, Gulhan Kaya, KaeChandra B. Weyenberg, Brittany Radke, Keiko Hino, Daniela C. Soto, Jia-Lin Shiu, Wenzhu Zhang, Cole Ingamells, Nicholas K. Haghani, Emily Xu, Joseph Rosas, Sergi Simó, Joel Miesfeld, Tom Glaser, Scott C Baraban, Li-En C Jao, Megan Y. Dennis","doi":"10.1101/2024.09.11.612570","DOIUrl":"https://doi.org/10.1101/2024.09.11.612570","journal":{"name":"bioRxiv - Genomics"},"publicationDate":"2024-09-11"}
Pub Date : 2024-09-11 DOI: 10.1101/2024.09.05.611437
Moritz A Peters, Volker Soltys, Dingwen Su, Marek Kučka, Yingguang Frank Chan
The adaptive immune system's efficacy relies on the diversity of T cell receptors and the ability to distinguish between self and foreign antigens. Analysis of the paired heterodimeric αβ-TCR chains of individual T cells requires single-cell resolution, but existing single-cell approaches offer limited coverage of the vast TCR repertoire diversity. Here we introduce CITR-seq, a novel, instrument-free, high-throughput method for single-cell TCR sequencing with >88% αβ-TCR pairing precision. Using CITR-seq, we analyzed the TCR repertoires of CD8+ T cells originating from 32 inbred mice, comprising four evolutionarily divergent sister species and their F1 hybrids. Overall, we identified more than 5 million confidently paired TCRs. We found that V(D)J gene usage patterns are highly specific to the genotype and that Vβ-gene usage is strongly impacted by thymic selection. Using F1 hybrids, we show that differences in gene segment usage across species are likely caused by cis-acting factors prior to thymic selection, which imposed strong allelic biases. At the greatest divergence, this led to increased rates of TCR depletion through rejection of particular Vβ-genes. TCR repertoire overlap analysis across all mice revealed that sharing of identical paired CDR3 amino acid motifs is four times more frequent than predicted by random pairing of TCRα and TCRβ chains, with significantly increased sharing rates among related individuals. Collectively, we show that beyond the stochastic nature of TCR repertoire generation, genetic factors contribute significantly to the shape of an individual's repertoire.
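The comparison between observed sharing of paired CDR3 motifs and the expectation under random TCRα/TCRβ pairing can be illustrated with a small permutation sketch. This is not the authors' pipeline, which the abstract does not describe. It assumes each repertoire is a list of (CDR3α, CDR3β) amino acid pairs, models "random pairing" by shuffling the β chains within each repertoire, and uses invented toy sequences; the function names `shared_pairs`, `shuffled`, and `sharing_enrichment` are hypothetical.

```python
import random


def shared_pairs(rep_a, rep_b):
    """Count distinct (alpha, beta) CDR3 pairs present in both repertoires."""
    return len(set(rep_a) & set(rep_b))


def shuffled(rep, rng):
    """Re-pair alpha and beta chains at random within one repertoire."""
    alphas = [alpha for alpha, _ in rep]
    betas = [beta for _, beta in rep]
    rng.shuffle(betas)
    return list(zip(alphas, betas))


def sharing_enrichment(rep_a, rep_b, n_perm=1000, seed=0):
    """Return (observed, expected, ratio) for paired-CDR3 sharing.

    'expected' is the mean number of shared pairs after randomly
    re-pairing chains in both repertoires (an assumed null model).
    """
    rng = random.Random(seed)
    observed = shared_pairs(rep_a, rep_b)
    null_counts = [
        shared_pairs(shuffled(rep_a, rng), shuffled(rep_b, rng))
        for _ in range(n_perm)
    ]
    expected = sum(null_counts) / n_perm
    ratio = observed / expected if expected > 0 else float("inf")
    return observed, expected, ratio


# Toy repertoires with invented CDR3 sequences (illustration only).
mouse_1 = [("CAVSG", "CASSL"), ("CAAST", "CASSP"),
           ("CALGD", "CASRD"), ("CAMRE", "CASSQ")]
mouse_2 = [("CAVSG", "CASSL"), ("CAAST", "CASSP"),
           ("CAVNT", "CASGT"), ("CAERG", "CASSF")]

observed, expected, ratio = sharing_enrichment(mouse_1, mouse_2)
print(f"observed={observed}, expected={expected:.2f}, ratio={ratio:.2f}")
```

A ratio above 1 means paired motifs are shared more often than the shuffled null predicts. With real repertoires of millions of pairs, an analytical expectation or a sparser sampling scheme would likely be needed in place of full shuffles.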
{"title":"Genetic determinants of distinct CD8+ α/β-TCR repertoires in the genus Mus","authors":"Moritz A Peters, Volker Soltys, Dingwen Su, Marek Kučka, Yingguang Frank Chan","doi":"10.1101/2024.09.05.611437","DOIUrl":"https://doi.org/10.1101/2024.09.05.611437","journal":{"name":"bioRxiv - Genomics"},"publicationDate":"2024-09-11"}