
Genome Biology: latest articles

H3K18 lactylation marks tissue-specific active enhancers.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-10-03; DOI: 10.1186/s13059-022-02775-y
Eva Galle, Chee-Wai Wong, Adhideb Ghosh, Thibaut Desgeorges, Kate Melrose, Laura C Hinte, Daniel Castellano-Castillo, Magdalena Engl, Joao Agostinho de Sousa, Francisco Javier Ruiz-Ojeda, Katrien De Bock, Jonatan R Ruiz, Ferdinand von Meyenn

Background: Histone lactylation has been recently described as a novel histone post-translational modification linking cellular metabolism to epigenetic regulation.

Results: Given the expected relevance of this modification and current limited knowledge of its function, we generate genome-wide datasets of H3K18la distribution in various in vitro and in vivo samples, including mouse embryonic stem cells, macrophages, adipocytes, and mouse and human skeletal muscle. We compare them to profiles of well-established histone modifications and gene expression patterns. Supervised and unsupervised bioinformatics analysis shows that global H3K18la distribution resembles H3K27ac, although we also find notable differences. H3K18la marks active CpG island-containing promoters of highly expressed genes across most tissues assessed, including many housekeeping genes, and positively correlates with H3K27ac and H3K4me3 as well as with gene expression. In addition, H3K18la is enriched at active enhancers that lie in proximity to genes that are functionally important for the respective tissue.

Conclusions: Overall, our data suggest that H3K18la is not only a marker for active promoters, but also a mark of tissue-specific active enhancers.

Genome Biology 23(1):207. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531456/pdf/
Citations: 12
Repeated turnovers keep sex chromosomes young in willows.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-23; DOI: 10.1186/s13059-022-02769-w
Deyan Wang, Yiling Li, Mengmeng Li, Wenlu Yang, Xinzhi Ma, Lei Zhang, Yubo Wang, Yanlin Feng, Yuanyuan Zhang, Ran Zhou, Brian J Sanderson, Ken Keefover-Ring, Tongming Yin, Lawrence B Smart, Stephen P DiFazio, Jianquan Liu, Matthew Olson, Tao Ma

Background: Salicaceae species have diverse sex determination systems and frequent sex chromosome turnovers. However, compared with poplars, the diversity of sex determination in willows is poorly understood, and little is known about the evolutionary forces driving their turnover. Here, we characterized sex determination in two Salix species, S. chaenomeloides and S. arbutifolia, which have XY systems on chromosomes 7 and 15, respectively.

Results: Based on the assemblies of their sex determination regions, we found that the sex determination mechanism of willows may have underlying similarities with poplars, both involving intact and/or partial homologs of a type A cytokinin response regulator (RR) gene. Comparative analyses suggested that at least two sex turnover events have occurred in Salix, one preserving the ancestral pattern of male heterogamety, and the other changing heterogametic sex from XY to ZW, which could be partly explained by the "deleterious mutation load" and "sexually antagonistic selection" theoretical models. We hypothesize that these repeated turnovers keep sex chromosomes of willow species in a perpetually young state, leading to limited degeneration.

Conclusions: Our findings refine the picture of the evolutionary trajectory of sex chromosomes in Salicaceae species, shed light on the evolutionary forces driving the repeated turnovers of their sex chromosomes, and provide a valuable reference for the study of sex chromosomes in other species.

Genome Biology 23(1):200. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502649/pdf/
Citations: 6
Comparative analyses of vertebrate CPEB proteins define two subfamilies with coordinated yet distinct functions in post-transcriptional gene regulation.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-12; DOI: 10.1186/s13059-022-02759-y
Berta Duran-Arqué, Manuel Cañete, Chiara Lara Castellazzi, Anna Bartomeu, Anna Ferrer-Caelles, Oscar Reina, Adrià Caballé, Marina Gay, Gianluca Arauz-Garofalo, Eulalia Belloc, Raúl Mendez

Background: Vertebrate CPEB proteins bind mRNAs at cytoplasmic polyadenylation elements (CPEs) in their 3' UTRs, leading to cytoplasmic changes in their poly(A) tail lengths; this can promote translational repression or activation of the mRNA. However, neither the regulation nor the mechanisms of action of the CPEB family per se have been systematically addressed to date.

Results: Based on a comparative analysis of the four vertebrate CPEBs, we determine their differential regulation by phosphorylation, the composition and properties of their supramolecular assemblies, and their target mRNAs. We show that all four CPEBs are able to recruit the CCR4-NOT deadenylation complex to repress translation. However, their regulation, mechanism of action, and target mRNAs define two subfamilies. Thus, CPEB1 forms ribonucleoprotein complexes that are remodeled upon a single phosphorylation event and are associated with mRNAs containing canonical CPEs. CPEB2-4 are regulated by multiple proline-directed phosphorylations that control their liquid-liquid phase separation. CPEB2-4 mRNA targets include CPEB1-bound transcripts, with canonical CPEs, but also a specific subset of mRNAs with non-canonical CPEs.

Conclusions: Altogether, these results show how, globally, the CPEB family of proteins is able to integrate cellular cues to generate a fine-tuned adaptive response in gene expression regulation through the coordinated actions of all four members.

Genome Biology 23(1):192. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9465852/pdf/
Citations: 0
Control of immediate early gene expression by CPEB4-repressor complex-mediated mRNA degradation.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-12; DOI: 10.1186/s13059-022-02760-5
Fabian Poetz, Svetlana Lebedeva, Johanna Schott, Doris Lindner, Uwe Ohler, Georg Stoecklin

Background: Cytoplasmic polyadenylation element-binding protein 4 (CPEB4) is known to associate with cytoplasmic polyadenylation elements (CPEs) located in the 3' untranslated region (UTR) of specific mRNAs and assemble an activator complex promoting the translation of target mRNAs through cytoplasmic polyadenylation.

Results: Here, we find that CPEB4 is part of an alternative repressor complex that mediates mRNA degradation by associating with the evolutionarily conserved CCR4-NOT deadenylase complex. We identify human CPEB4 as an RNA-binding protein (RBP) with enhanced association to poly(A) RNA upon inhibition of class I histone deacetylases (HDACs), a condition known to cause widespread degradation of poly(A)-containing mRNA. Photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) analysis using endogenously tagged CPEB4 in HeLa cells reveals that CPEB4 preferentially binds to the 3'UTR of immediate early gene mRNAs, at G-containing variants of the canonical U- and A-rich CPE located in close proximity to poly(A) sites. By transcriptome-wide mRNA decay measurements, we find that the strength of CPEB4 binding correlates with short mRNA half-lives and that loss of CPEB4 expression leads to the stabilization of immediate early gene mRNAs. Akin to CPEB4, we demonstrate that CPEB1 and CPEB2 also confer mRNA instability by recruitment of the CCR4-NOT complex.

Conclusions: While CPEB4 was previously known for its ability to stimulate cytoplasmic polyadenylation, our findings establish an additional function for CPEB4 as the RNA adaptor of a repressor complex that enhances the degradation of short-lived immediate early gene mRNAs.
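
As an illustration of what an mRNA half-life measurement involves, the sketch below estimates a half-life from a time course under the assumption of first-order (exponential) decay, where half-life = ln(2) / k. The abstract does not describe the fitting procedure used in the study, so the function name `half_life` and the log-linear least-squares fit are assumptions made for this example only.

```python
import math


def half_life(times, abundances):
    """Estimate half-life assuming abundance(t) = A0 * exp(-k * t).

    Fits ln(abundance) against time by ordinary least squares and
    returns ln(2) / k. Hypothetical helper, not the study's pipeline.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    # Slope of the regression line ln(abundance) ~ t; equals -k.
    slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    if slope >= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / -slope


# Made-up example: abundance halves every hour.
print(half_life([0, 1, 2], [100.0, 50.0, 25.0]))
```

With half-lives in hand for many transcripts, a rank correlation against a binding-strength score would be one simple way to look for the kind of relationship described in the Results.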

Genome Biology 23(1):193. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9465963/pdf/
Citations: 0
Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-09; DOI: 10.1186/s13059-022-02755-2
Klara Kuret, Aram Gustav Amalietti, D Marc Jones, Charlotte Capitanchik, Jernej Ule

Background: Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA-protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells.

Results: We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins.

Conclusions: Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner (https://imaps.goodwright.com/apps/peka/).
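
The general idea of motif enrichment around crosslink sites, with background taken from the same data rather than from an input control, can be sketched in a few lines. This is a toy illustration and not the PEKA algorithm: the function names, the window size, the pseudocount, and the choice of background positions are all hypothetical.

```python
from collections import Counter


def window_kmers(seq, positions, k, flank):
    """Count k-mers lying fully inside [p - flank, p + flank] for each position p."""
    counts = Counter()
    for p in positions:
        window = seq[max(0, p - flank):min(len(seq), p + flank + 1)]
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
    return counts


def kmer_enrichment(seq, sites, background, k=3, flank=1, pseudo=1.0):
    """Ratio of k-mer frequency near crosslink sites to frequency near background positions."""
    fg = window_kmers(seq, sites, k, flank)
    bg = window_kmers(seq, background, k, flank)
    fg_total = sum(fg.values()) + pseudo
    bg_total = sum(bg.values()) + pseudo
    # A Counter returns 0 for k-mers never seen in the background.
    return {
        kmer: ((fg[kmer] + pseudo) / fg_total) / ((bg[kmer] + pseudo) / bg_total)
        for kmer in fg
    }


# Made-up transcript with one crosslink site (index 5) and one distal
# background position (index 11) on the same sequence.
scores = kmer_enrichment("CCCCUGUCCCCCCCC", sites=[5], background=[11])
best = max(scores, key=scores.get)
print(best, scores[best])
```

Drawing the background from the same transcripts is what lets a comparison like this absorb regional sequence composition; a real tool would also need to handle many sites, strand, and position-specific weighting.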

Genome Biology 23(1):191. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9461102/pdf/
Citations: 0
Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-08; DOI: 10.1186/s13059-022-02743-6
Jamshed Khan, Marek Kokot, Sebastian Deorowicz, Rob Patro

The de Bruijn graph is a key data structure in modern computational genomics, and construction of its compacted variant resides upstream of many genomic analyses. As the quantity of genomic data grows rapidly, this often forms a computational bottleneck. We present Cuttlefish 2, significantly advancing the state-of-the-art for this problem. On a commodity server, it reduces the graph construction time for 661K bacterial genomes, of size 2.58Tbp, from 4.5 days to 17-23 h; and it constructs the graph for 1.52Tbp white spruce reads in approximately 10 h, while the closest competitor requires 54-58 h, using considerably more memory.
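
To make the term "compacted" concrete: compaction collapses every maximal non-branching path of the de Bruijn graph into a single string, usually called a unitig. The sketch below is a small in-memory version for illustration only. It assumes a node-centric graph over the forward strand (reverse complements are ignored), the name `compacted_unitigs` is hypothetical, and it says nothing about how Cuttlefish 2 achieves its speed or memory footprint.

```python
def compacted_unitigs(seqs, k):
    """Return the unitigs (maximal non-branching paths) of the k-mer graph of seqs."""
    kmers = {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}

    def succ(km):
        return [km[1:] + c for c in "ACGT" if km[1:] + c in kmers]

    def pred(km):
        return [c + km[:-1] for c in "ACGT" if c + km[:-1] in kmers]

    def is_start(km):
        # A k-mer opens a unitig unless it is the only successor of its only predecessor.
        p = pred(km)
        return not (len(p) == 1 and len(succ(p[0])) == 1)

    visited, unitigs = set(), []

    def walk(km):
        # Extend to the right while the path stays non-branching.
        path, cur = km, km
        visited.add(km)
        while True:
            s = succ(cur)
            if len(s) != 1 or len(pred(s[0])) != 1 or s[0] in visited:
                return path
            cur = s[0]
            visited.add(cur)
            path += cur[-1]

    for km in sorted(kmers):
        if is_start(km):
            unitigs.append(walk(km))
    # Anything still unvisited sits on an isolated cycle with no natural start.
    for km in sorted(kmers):
        if km not in visited:
            unitigs.append(walk(km))
    return sorted(unitigs)


# Two reads sharing the prefix ACGT and then diverging.
print(compacted_unitigs(["ACGTT", "ACGTA"], 3))
```

Holding every k-mer in a Python set is exactly what stops scaling at the terabase sizes quoted in the abstract, which is why dedicated tools are needed for this step.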

Genome Biology 23(1):190. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9454175/pdf/
Citations: 3
Genomic insights into the evolutionary history and diversification of bulb traits in garlic.
IF 12.3; Zone 1 (Biology); Q1 Agricultural and Biological Sciences; Pub Date: 2022-09-07; DOI: 10.1186/s13059-022-02756-1
Ningyang Li, Xueyu Zhang, Xiudong Sun, Siyuan Zhu, Yi Cheng, Meng Liu, Song Gao, Jiangjiang Zhang, Yanzhou Wang, Xiai Yang, Jianrong Chen, Fu Li, Qiaoyun He, Zheng Zeng, Xiaoge Yuan, Zhiman Zhou, Longchuan Ma, Taotao Wang, Xiang Li, Hanqiang Liu, Yupeng Pan, Mengyan Zhou, Chunsheng Gao, Gang Zhou, Zhenlin Han, Shiqi Liu, Jianguang Su, Zhihui Cheng, Shilin Tian, Touming Liu

Background: Garlic is an entirely sterile crop with important value as a vegetable, condiment, and medicine. However, the evolutionary history of garlic remains largely unknown.

Results: Here we report a comprehensive map of garlic genomic variation, comprising 129.4 million variations. Evolutionary analysis indicates that the garlic population diverged at least 100,000 years ago, and that the two groups cultivated in China were domesticated via two independent routes. Consequently, 15.0% and 17.5% of genes underwent an expression change in the two cultivated groups, reshaping their transcriptomic architecture. Furthermore, we find that independent domestication leads to little overlap of deleterious substitutions between these two groups, owing to separate accumulation and selection-based removal. By analysis of selective sweeps, genome-wide trait associations, and associated transcriptomic analysis, we uncover differential selection for bulb traits in these two garlic groups during their domestication.

Conclusions: This study provides valuable resources for garlic genomics-based breeding, and comprehensive insights into the evolutionary history of this clonally propagated crop.

Genome Biology 23(1):188. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9450234/pdf/
Citations: 3
The Arabidopsis APOLO and human UPAT sequence-unrelated long noncoding RNAs can modulate DNA and histone methylation machineries in plants.
IF 12.3, Zone 1 (Biology), Q1 Agricultural and Biological Sciences. Pub Date: 2022-08-29. DOI: 10.1186/s13059-022-02750-7
Camille Fonouni-Farde, Aurélie Christ, Thomas Blein, María Florencia Legascue, Lucía Ferrero, Michaël Moison, Leandro Lucero, Juan Sebastián Ramírez-Prado, David Latrasse, Daniel Gonzalez, Moussa Benhamed, Leandro Quadrana, Martin Crespi, Federico Ariel

Background: RNA-DNA hybrid (R-loop)-associated long noncoding RNAs (lncRNAs), including the Arabidopsis lncRNA AUXIN-REGULATED PROMOTER LOOP (APOLO), are emerging as important regulators of three-dimensional chromatin conformation and gene transcriptional activity.

Results: Here, we show that in addition to the PRC1-component LIKE HETEROCHROMATIN PROTEIN 1 (LHP1), APOLO interacts with the methylcytosine-binding protein VARIANT IN METHYLATION 1 (VIM1), a conserved homolog of the mammalian DNA methylation regulator UBIQUITIN-LIKE CONTAINING PHD AND RING FINGER DOMAINS 1 (UHRF1). The APOLO-VIM1-LHP1 complex directly regulates the transcription of the auxin biosynthesis gene YUCCA2 by dynamically determining DNA methylation and H3K27me3 deposition over its promoter during the plant thermomorphogenic response. Strikingly, we demonstrate that the lncRNA UHRF1 Protein Associated Transcript (UPAT), a direct interactor of UHRF1 in humans, can be recognized by VIM1 and LHP1 in plant cells, despite the lack of sequence homology between UPAT and APOLO. In addition, we show that increased levels of APOLO or UPAT hamper VIM1 and LHP1 binding to the YUCCA2 promoter and globally alter the Arabidopsis transcriptome in a similar manner.

Conclusions: Collectively, our results uncover a new mechanism in which a plant lncRNA coordinates Polycomb action and DNA methylation through interaction with VIM1, and indicate that evolutionarily unrelated lncRNAs with potentially conserved structures may exert similar functions by interacting with homologous partners.

Citations: 0
Multiple genome alignment in the telomere-to-telomere assembly era.
IF 12.3, Zone 1 (Biology), Q1 Agricultural and Biological Sciences. Pub Date: 2022-08-29. DOI: 10.1186/s13059-022-02735-6
Bryce Kille, Advait Balaji, Fritz J Sedlazeck, Michael Nute, Todd J Treangen

With the arrival of telomere-to-telomere (T2T) assemblies of the human genome comes the computational challenge of efficiently and accurately constructing multiple genome alignments at an unprecedented scale. By identifying nucleotides across genomes which share a common ancestor, multiple genome alignments commonly serve as the bedrock for comparative genomics studies. In this review, we provide an overview of the algorithmic template that most multiple genome alignment methods follow. We also discuss prospective areas of improvement of multiple genome alignment for keeping up with continuously arriving high-quality T2T assembled genomes and for unlocking clinically-relevant insights.

Citations: 13
Modeling zero inflation is not necessary for spatial transcriptomics.
IF 12.3, Zone 1 (Biology), Q1 Agricultural and Biological Sciences. Pub Date: 2022-05-18. DOI: 10.1186/s13059-022-02684-0
Peiyao Zhao, Jiaqiang Zhu, Ying Ma, Xiang Zhou

Background: Spatial transcriptomics are a set of new technologies that profile gene expression on tissues with spatial localization information. With technological advances, recent spatial transcriptomics data are often in the form of sparse counts with an excessive amount of zero values.

Results: We perform a comprehensive analysis on 20 spatial transcriptomics datasets collected from 11 distinct technologies to characterize the distributional properties of the expression count data and understand the statistical nature of the zero values. Across datasets, we show that a substantial fraction of genes displays overdispersion and/or zero inflation that cannot be accounted for by a Poisson model, with genes displaying overdispersion substantially overlapping with genes displaying zero inflation. In addition, we find that either the Poisson or the negative binomial model is sufficient for modeling the majority of genes across most spatial transcriptomics technologies. We further show major sources of overdispersion and zero inflation in spatial transcriptomics, including gene expression heterogeneity across tissue locations and the spatial distribution of cell types. In particular, when we focus on a relatively homogeneous set of tissue locations or control for cell type compositions, the number of detected overdispersed and/or zero-inflated genes is substantially reduced, and a simple Poisson model is often sufficient to fit the gene expression data there.

Conclusions: Our study provides the first comprehensive evidence that excessive zeros in spatial transcriptomics are not due to zero inflation, supporting the use of count models without a zero inflation component for modeling spatial transcriptomics.
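
To make the distinction between overdispersion and zero inflation concrete, the following is a minimal sketch and not the authors' analysis pipeline. For one gene, it compares the observed fraction of zero counts with the fraction expected under a Poisson model and under a negative binomial model whose dispersion is estimated by the method of moments. The function name `expected_zero_fractions` and the toy counts are hypothetical, chosen only for illustration.

```python
import numpy as np


def expected_zero_fractions(counts):
    """Compare observed zeros with Poisson and negative binomial expectations.

    Illustrative only: the dispersion is a crude method-of-moments estimate,
    not a fitted model, and the function name is hypothetical.
    """
    counts = np.asarray(counts, dtype=float)
    mu = counts.mean()
    var = counts.var(ddof=1)  # sample variance

    observed = float(np.mean(counts == 0))
    poisson_zero = float(np.exp(-mu))  # P(X = 0) under Poisson(mu)

    if var > mu > 0:
        # Negative binomial with variance mu + mu^2 / r, so r = mu^2 / (var - mu)
        r = mu**2 / (var - mu)
        nb_zero = float((r / (r + mu)) ** r)  # P(X = 0) under NB(mean=mu, size=r)
    else:
        # No overdispersion detected: the NB reduces to the Poisson limit
        nb_zero = poisson_zero

    return {
        "observed": observed,
        "poisson": poisson_zero,
        "negative_binomial": nb_zero,
    }


# Toy counts for a single gene across four tissue locations (made-up values)
result = expected_zero_fractions([0, 0, 1, 3])
print(result)
```

In this toy case the Poisson model expects fewer zeros than are observed, while the moment-matched negative binomial already accounts for them, so the excess zeros reflect overdispersion and need no separate zero-inflation component.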

Citations: 0