Genome Biology: latest articles

scParser: sparse representation learning for scalable single-cell RNA sequencing data analysis
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-16, DOI: 10.1186/s13059-024-03345-0
Kai Zhao, Hon-Cheong So, Zhixiang Lin
The rapid rise in the availability and scale of scRNA-seq data calls for scalable methods for integrative analysis. Though many methods for data integration have been developed, few focus on understanding the heterogeneous effects of biological conditions across different cell populations in integrative analysis. Our proposed scalable approach, scParser, models the heterogeneous effects of biological conditions, which unveils the key mechanisms by which gene expression contributes to phenotypes. Notably, the extended scParser pinpoints biological processes in cell subpopulations that contribute to disease pathogenesis. scParser achieves favorable performance in cell clustering compared to state-of-the-art methods and has broad and diverse applicability.
Citations: 0
Associating transcription factors to single-cell trajectories with DREAMIT
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-14, DOI: 10.1186/s13059-024-03368-7
Nathan D. Maulding, Lucas Seninge, Joshua M. Stuart
Inferring gene regulatory networks from single-cell RNA-sequencing trajectories has been an active area of research, yet methods are still needed to identify regulators governing cell transitions. We developed DREAMIT (Dynamic Regulation of Expression Across Modules in Inferred Trajectories) to annotate transcription-factor activity along single-cell trajectory branches, using ensembles of relations to target genes. Using a benchmark representing several different tissues, as well as external validation with ATAC-Seq and Perturb-Seq data on hematopoietic cells, the method was found to have higher tissue-specific sensitivity and specificity than competing approaches.
Citations: 0
Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-14, DOI: 10.1186/s13059-024-03365-w
William DeGroat, Fumitaka Inoue, Tal Ashuach, Nir Yosef, Nadav Ahituv, Anat Kreimer
Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of the regulatory programs this variation affects can shed light on the apparatuses of human diseases. We collect epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we construct networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks serve as the base for a rich series of analyses, through which we demonstrate their temporal dynamics and enrichment for various disease-associated variants. We apply the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrate methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes; this includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Citations: 0
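The abstract above mentions applying the Girvan-Newman clustering algorithm to enhancer-promoter networks. As a minimal sketch of that general technique (not the authors' E-P-INAnalyzer code), the following standard-library Python removes the edge with the highest betweenness until the graph splits into one more component. The toy network, the node names `E1`..`E4` and `P1`..`P4`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return score

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman_split(adj):
    """Remove highest-betweenness edges until one extra component appears."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical toy network: two enhancer-promoter modules joined by one link.
toy = build_graph([
    ("E1", "P1"), ("E1", "P2"), ("E2", "P1"), ("E2", "P2"),
    ("E3", "P3"), ("E3", "P4"), ("E4", "P3"), ("E4", "P4"),
    ("E2", "P3"),  # the single bridge between the two modules
])
modules = sorted(sorted(c) for c in girvan_newman_split(toy))
print(modules)
```

In this toy case the bridge `E2`-`P3` carries every shortest path between the two modules, so it is removed first and the two modules fall apart. Repeating the split on each resulting component would give the full hierarchical decomposition.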
The GC-content at the 5′ ends of human protein-coding genes is undergoing mutational decay
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-13, DOI: 10.1186/s13059-024-03364-x
Yi Qiu, Yoon Mo Kang, Christopher Korfmann, Fanny Pouyet, Andrew Eckford, Alexander F. Palazzo
In vertebrates, most protein-coding genes have a peak of GC-content near their 5′ transcriptional start site (TSS). This feature promotes both the efficient nuclear export and translation of mRNAs. Despite the importance of GC-content for RNA metabolism, its general features, origin, and maintenance remain mysterious. We investigate the evolutionary forces shaping GC-content at the TSS of genes, both through comparative genomic analysis of nucleotide substitution rates between different species and by examining human de novo mutations. Our data suggest that GC-peaks at TSSs were present in the last common ancestor of amniotes, and likely that of vertebrates. We observe that in apes and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at the 5′ end of protein-coding genes is currently undergoing mutational decay. In canids, which lack PRDM9 and perform recombination at TSSs, GC-content at the 5′ end of protein-coding genes is increasing. We show that these patterns extend into the 5′ end of the open reading frame, thus impacting synonymous codon position choices. Our results indicate that the dynamics of this GC-peak in amniotes are largely shaped by historic patterns of recombination. Since decay of GC-content towards the mutation rate equilibrium is the default state for non-functional DNA, the observed decrease in GC-content at TSSs in apes and rodents indicates that the GC-peak is not being maintained by selection on most protein-coding genes in those species.
Citations: 0
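To make the quantity in the abstract above concrete, here is a minimal sketch of how GC-content can be profiled in consecutive windows downstream of a TSS. It is an illustration only: the toy sequence, the window size, and the function names `gc_content` and `gc_profile` are assumptions and do not come from the paper.

```python
def gc_content(seq):
    """Fraction of G and C among the A/C/G/T bases in seq (0.0 if none)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)

def gc_profile(seq, window):
    """GC-content of consecutive non-overlapping windows, read 5' to 3'."""
    return [gc_content(seq[i:i + window]) for i in range(0, len(seq), window)]

# Hypothetical toy sequence: GC-rich at the 5' end, AT-rich further downstream.
toy = "GCGCGGCCGC" + "GCATGCATGC" + "ATATTAATAT"
profile = gc_profile(toy, window=10)
print(profile)
```

A profile that falls with distance from the start, as in this toy sequence, is the kind of 5′ GC-peak the abstract describes. Comparing such profiles between ancestral and derived alleles is one way to think about decay, although the paper's actual analysis relies on substitution rates and de novo mutations.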
SynGAP: a synteny-based toolkit for gene structure annotation polishing
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-13, DOI: 10.1186/s13059-024-03359-8
Fengqi Wu, Yingxiao Mai, Chengjie Chen, Rui Xia
Genome sequencing has become a routine task for biologists, but the challenge of gene structure annotation persists, impeding accurate genomic and genetic research. Here, we present a bioinformatics toolkit, SynGAP (Synteny-based Gene structure Annotation Polisher), which uses gene synteny information to accomplish precise and automated polishing of gene structure annotation of genomes. SynGAP offers exceptional capabilities in the improvement of gene structure annotation quality and the profiling of integrative gene synteny between species. Furthermore, an expression variation index is designed for comparative transcriptomics analysis to explore candidate genes responsible for the development of distinct traits observed in phylogenetically related species.
Citations: 0
Prevalence of and gene regulatory constraints on transcriptional adaptation in single cells
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-12, DOI: 10.1186/s13059-024-03351-2
Ian A. Mellis, Madeline E. Melzer, Nicholas Bodkin, Yogesh Goyal
Cells and tissues have a remarkable ability to adapt to genetic perturbations via a variety of molecular mechanisms. Nonsense-induced transcriptional compensation, a form of transcriptional adaptation, has recently emerged as one such mechanism, in which nonsense mutations in a gene trigger upregulation of related genes, possibly conferring robustness at cellular and organismal levels. However, beyond a handful of developmental contexts and curated sets of genes, no comprehensive genome-wide investigation of this behavior has been undertaken for mammalian cell types and conditions. How the regulatory-level effects of inherently stochastic compensatory gene networks contribute to phenotypic penetrance in single cells remains unclear. We analyze existing bulk and single-cell transcriptomic datasets to uncover the prevalence of transcriptional adaptation in mammalian systems across diverse contexts and cell types. We perform regulon gene expression analyses of transcription factor target sets in both bulk and pooled single-cell genetic perturbation datasets. Our results reveal greater robustness in expression of regulons of transcription factors exhibiting transcriptional adaptation compared to those of transcription factors that do not. Stochastic mathematical modeling of minimal compensatory gene networks qualitatively recapitulates several aspects of transcriptional adaptation, including paralog upregulation and robustness to mutation. Combined with machine learning analysis of network features of interest, our framework offers potential explanations for which regulatory steps are most important for transcriptional adaptation. Our integrative approach identifies several putative hits (genes demonstrating possible transcriptional adaptation) to follow up on experimentally and provides a formal quantitative framework to test and refine models of transcriptional adaptation.
Citations: 0
READv2: advanced and user-friendly detection of biological relatedness in archaeogenomics
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-12, DOI: 10.1186/s13059-024-03350-3
Erkin Alaçamlı, Thijessen Naidoo, Merve N. Güler, Ekin Sağlıcan, Şevval Aktürk, Igor Mapelli, Kıvılcım Başak Vural, Mehmet Somel, Helena Malmström, Torsten Günther
The advent of genome-wide ancient DNA analysis has revolutionized our understanding of prehistoric societies. However, studying biological relatedness in these groups requires tailored approaches due to the challenges of analyzing ancient DNA. READv2, an optimized Python3 implementation of the most widely used tool for this purpose, addresses these challenges while surpassing its predecessor in speed and accuracy. For sufficient amounts of data, it can classify up to third-degree relatedness and differentiate between the two types of first-degree relatedness, full siblings and parent-offspring. READv2 enables user-friendly, efficient, and nuanced analysis of biological relatedness, facilitating a deeper understanding of past social structures.
Citations: 0
Genomic reproducibility in the bioinformatics era
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-09, DOI: 10.1186/s13059-024-03343-2
Pelin Icer Baykal, Paweł Piotr Łabaj, Florian Markowetz, Lynn M. Schriml, Daniel J. Stekhoven, Serghei Mangul, Niko Beerenwinkel
In biomedical research, validating a scientific discovery hinges on the reproducibility of its experimental results. However, in genomics, the definition and implementation of reproducibility remain imprecise. We argue that genomic reproducibility, defined as the ability of bioinformatics tools to maintain consistent results across technical replicates, is essential for advancing scientific knowledge and medical applications. Initially, we examine different interpretations of reproducibility in genomics to clarify terms. Subsequently, we discuss the impact of bioinformatics tools on genomic reproducibility and explore methods for evaluating these tools regarding their effectiveness in ensuring genomic reproducibility. Finally, we recommend best practices to improve genomic reproducibility.
Citations: 0
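The abstract above defines genomic reproducibility as consistency of a tool's results across technical replicates. One simple way to quantify that idea, sketched here purely as an assumption rather than as the measure the authors recommend, is the Jaccard index between the sets of calls a tool produces on two replicates. The variant tuples and the function name `jaccard` are hypothetical.

```python
def jaccard(calls_a, calls_b):
    """Jaccard index of two collections of calls (1.0 when both are empty)."""
    a, b = set(calls_a), set(calls_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical variant calls (chromosome, position, ref, alt) from the same
# tool run on two technical replicates of one sample.
replicate_1 = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2050, "C", "T"),
    ("chr2", 300, "G", "A"),
}
replicate_2 = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 300, "G", "A"),
    ("chr3", 77, "T", "C"),
}
concordance = jaccard(replicate_1, replicate_2)
print(concordance)
```

Here two of the four distinct calls are shared, so the concordance is 0.5. A value near 1.0 would indicate that the tool's output is stable across replicates under this particular measure.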
Benchmarking clustering, alignment, and integration methods for spatial transcriptomics
IF 12.3, Zone 1 (Biology), Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY, Pub Date: 2024-08-09, DOI: 10.1186/s13059-024-03361-0
Yunfei Hu, Manfei Xie, Yikang Li, Mingxing Rao, Wenjun Shen, Can Luo, Haoran Qin, Jihoon Baek, Xin Maizie Zhou
Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remains challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of comprehensive benchmark studies complicates the selection of methods and future method development. In this study, we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics and analyses, including eight metrics for spatial clustering accuracy and contiguity, uniform manifold approximation and projection visualization, layer-wise and spot-to-spot alignment accuracy, and 3D reconstruction, which are designed to assess method performance as well as data quality. The code used for evaluation is available on our GitHub. Additionally, we provide online notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets. Our analyses lead to comprehensive recommendations that cover multiple aspects, helping users to select optimal tools for their specific needs and guide future method development.
Citations: 0
Creating large-scale genetic diversity in Arabidopsis via base editing-mediated deep artificial evolution
IF 12.3 Zone 1 (Biology) Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-08-09 DOI: 10.1186/s13059-024-03358-9
Xiang Wang, Wenbo Pan, Chao Sun, Hong Yang, Zhentao Cheng, Fei Yan, Guojing Ma, Yun Shang, Rui Zhang, Caixia Gao, Lijing Liu, Huawei Zhang
Base editing is a powerful tool for artificial evolution to create allelic diversity and improve agronomic traits. However, the great evolutionary potential of every sgRNA target has been overlooked, and there is currently no high-throughput method for generating and characterizing as many changes as possible in a single target from large mutant pools to permit rapid gene directed evolution in plants. In this study, we establish an efficient germline-specific evolution system to screen beneficial alleles in Arabidopsis, which could be applied to crop improvement. This system is based on a strong egg cell-specific cytosine base editor and the large seed production of Arabidopsis, which enables each T1 plant with unedited wild-type alleles to produce thousands of independent T2 mutant lines. It can create a wide range of mutant lines, including those containing atypical base substitutions, and it provides a space- and labor-saving way to store and screen the resulting mutant libraries. Using this system, we efficiently generate herbicide-resistant EPSPS, ALS, and HPPD variants that could be used in crop breeding. Here, we demonstrate the significant potential of base editing-mediated artificial evolution for each sgRNA target and devise an efficient system for conducting deep evolution to harness this potential.
Citations: 0