
Genome Biology: Latest Articles

IAMSAM: image-based analysis of molecular signatures using the Segment Anything Model
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-11 | DOI: 10.1186/s13059-024-03380-x
Dongjoo Lee, Jeongbin Park, Seungho Cook, Seongjin Yoo, Daeseung Lee, Hongyoon Choi
Spatial transcriptomics is a cutting-edge technique that combines gene expression with spatial information, allowing researchers to study molecular patterns within tissue architecture. Here, we present IAMSAM, a user-friendly web-based tool for analyzing spatial transcriptomics data focusing on morphological features. IAMSAM accurately segments tissue images using the Segment Anything Model, allowing for the semi-automatic selection of regions of interest based on morphological signatures. Furthermore, IAMSAM provides downstream analysis, such as identifying differentially expressed genes, enrichment analysis, and cell type prediction within the selected regions. With its simple interface, IAMSAM empowers researchers to explore and interpret heterogeneous tissues in a streamlined manner.
Citations: 0
Adenine base editors induce off-target structure variations in mouse embryos and primary human T cells
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-11 | DOI: 10.1186/s13059-024-03434-0
Leilei Wu, Shutan Jiang, Meisong Shi, Tanglong Yuan, Yaqin Li, Pinzheng Huang, Yingqi Li, Erwei Zuo, Changyang Zhou, Yidi Sun
The safety of CRISPR-based gene editing methods is of the utmost priority in clinical applications. Previous studies have reported that Cas9 cleavage induced frequent aneuploidy in primary human T cells, but whether cleavage-mediated editing of base editors would generate off-target structure variations remains unknown. Here, we investigate the potential off-target structural variations associated with CRISPR/Cas9, ABE, and CBE editing in mouse embryos and primary human T cells by whole-genome sequencing and single-cell RNA-seq analyses. The results show that both Cas9 and ABE generate off-target structural variations (SVs) in mouse embryos, while CBE induces rare SVs. In addition, off-target large deletions are detected in 32.74% of primary human T cells transfected with Cas9 and 9.17% of cells transfected with ABE. Moreover, Cas9-induced aneuploid cells activate the P53 and apoptosis pathways, whereas ABE-associated aneuploid cells significantly upregulate cell cycle-related genes and are arrested in the G0 phase. A percentage of 16.59% and 4.29% aneuploid cells are still observable at 3 weeks post transfection of Cas9 or ABE. These off-target phenomena in ABE are universal as observed in other cell types such as B cells and Huh7. Furthermore, the off-target SVs are significantly reduced in cells treated with high-fidelity ABE (ABE-V106W). This study shows both CRISPR/Cas9 and ABE induce off-target SVs in mouse embryos and primary human T cells, raising an urgent need for the development of high-fidelity gene editing tools.
Citations: 0
SpottedPy quantifies relationships between spatial transcriptomic hotspots and uncovers environmental cues of epithelial-mesenchymal plasticity in breast cancer
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-11 | DOI: 10.1186/s13059-024-03428-y
Eloise Withnell, Maria Secrier
Spatial transcriptomics is revolutionizing the exploration of intratissue heterogeneity in cancer, yet capturing cellular niches and their spatial relationships remains challenging. We introduce SpottedPy, a Python package designed to identify tumor hotspots and map spatial interactions within the cancer ecosystem. Using SpottedPy, we examine epithelial-mesenchymal plasticity in breast cancer and highlight stable niches associated with angiogenic and hypoxic regions, shielded by CAFs and macrophages. Hybrid and mesenchymal hotspot distribution follows transformation gradients reflecting progressive immunosuppression. Our method offers flexibility to explore spatial relationships at different scales, from immediate neighbors to broader tissue modules, providing new insights into tumor microenvironment dynamics.
Citations: 0
scDOT: optimal transport for mapping senescent cells in spatial transcriptomics
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-08 | DOI: 10.1186/s13059-024-03426-0
Nam D. Nguyen, Lorena Rosas, Timur Khaliullin, Peiran Jiang, Euxhen Hasanaj, Jose A. Ovando-Ricardez, Marta Bueno, Irfan Rahman, Gloria S. Pryhuber, Dongmei Li, Qin Ma, Toren Finkel, Melanie Königshoff, Oliver Eickelberg, Mauricio Rojas, Ana L. Mora, Jose Lugo-Martinez, Ziv Bar-Joseph
The low resolution of spatial transcriptomics data necessitates additional information for optimal use. We developed scDOT, which combines spatial transcriptomics and single cell RNA sequencing to improve the ability to reconstruct single cell resolved spatial maps and identify senescent cells. scDOT integrates optimal transport and expression deconvolution to learn non-linear couplings between cells and spots and to infer cell placements. Application of scDOT to lung spatial transcriptomics data improves on prior methods and allows the identification of the spatial organization of senescent cells, their neighboring cells and novel genes involved in cell-cell interactions that may be driving senescence.
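The abstract names optimal transport as the mechanism for coupling cells to spots. As a rough illustration only, the sketch below runs generic entropy-regularized optimal transport (Sinkhorn iterations) on a toy cost matrix. It is not scDOT's formulation, which also involves expression deconvolution; the function name `sinkhorn`, the cost values, the marginals, and the parameters `eps` and `n_iter` are all assumptions made for this example.

```python
from math import exp


def sinkhorn(cost, a, b, eps=0.2, n_iter=200):
    """Entropy-regularized optimal transport (generic Sinkhorn sketch).

    cost[i][j]: hypothetical dissimilarity between cell i and spot j.
    a: mass per cell (sums to 1); b: capacity per spot (sums to 1).
    Returns a coupling matrix whose rows sum to a and columns to b.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low cost gives a large kernel entry.
    K = [[exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy data (assumed): three cells, two spots.
cost = [[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]]
a = [1 / 3] * 3
b = [0.5, 0.5]
plan = sinkhorn(cost, a, b)
```

Each row of `plan` can be read as a soft assignment of one cell across spots; a cell that is much cheaper to place in one spot sends most of its mass there, while an ambiguous cell is split.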
Citations: 0
GraphPCA: a fast and interpretable dimension reduction algorithm for spatial transcriptomics data
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-07 | DOI: 10.1186/s13059-024-03429-x
Jiyuan Yang, Lu Wang, Lin Liu, Xiaoqi Zheng
The rapid advancement of spatial transcriptomics technologies has revolutionized our understanding of cell heterogeneity and intricate spatial structures within tissues and organs. However, the high dimensionality and noise in spatial transcriptomic data present significant challenges for downstream data analyses. Here, we develop GraphPCA, an interpretable and quasi-linear dimension reduction algorithm that leverages the strengths of graphical regularization and principal component analysis. Comprehensive evaluations on simulated and multi-resolution spatial transcriptomic datasets generated from various platforms demonstrate the capacity of GraphPCA to enhance downstream analysis tasks including spatial domain detection, denoising, and trajectory inference compared to other state-of-the-art methods.
Citations: 0
CaClust: linking genotype to transcriptional heterogeneity of follicular lymphoma using BCR and exomic variants
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-05 | DOI: 10.1186/s13059-024-03417-1
Kazimierz Oksza-Orzechowski, Edwin Quinten, Shadi Shafighi, Szymon M. Kiełbasa, Hugo W. van Kessel, Ruben A. L. de Groen, Joost S. P. Vermaat, Julieta H. Sepúlveda Yáñez, Marcelo A. Navarrete, Hendrik Veelken, Cornelis A. M. van Bergen, Ewa Szczurek
Tumours exhibit high genotypic and transcriptional heterogeneity. Both affect cancer progression and treatment, but have been predominantly studied separately in follicular lymphoma. To comprehensively investigate the evolution and genotype-to-phenotype maps in follicular lymphoma, we introduce CaClust, a probabilistic graphical model integrating deep whole exome, single-cell RNA and B-cell receptor sequencing data to infer clone genotypes, cell-to-clone mapping, and single-cell genotyping. CaClust outperforms a state-of-the-art model on simulated and patient data. In-depth analyses of single cells from four samples showcase effects of driver mutations, follicular lymphoma evolution, possible therapeutic targets, and single-cell genotyping that agrees with an independent targeted resequencing experiment.
Citations: 0
TDFPS-Designer: an efficient toolkit for barcode design and selection in nanopore sequencing
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-11-04 | DOI: 10.1186/s13059-024-03423-3
Junhai Qi, Zhengyi Li, Yao-zhong Zhang, Guojun Li, Xin Gao, Renmin Han
Oxford Nanopore Technologies (ONT) offers ultrahigh-throughput multi-sample sequencing but only provides barcode kits that enable up to 96-sample multiplexing. We present TDFPS-Designer, a new toolkit for nanopore sequencing barcode design, which creates significantly more barcodes: 137 with a length of 20 base pairs, 410 at 24 bp, and 1779 at 30 bp, far surpassing ONT’s offerings. It includes GPU-based acceleration for ultra-fast demultiplexing and designs robust barcodes suitable for high-error ONT data. TDFPS-Designer outperforms current methods, improving the demultiplexing recall rate by 20% relative to Guppy, without a reduction in precision.
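To make the idea of designing barcodes that stay distinguishable in high-error data more concrete, here is a minimal sketch of one generic approach: greedily keeping random candidate sequences whose pairwise edit distance stays above a threshold. The abstract does not describe how TDFPS-Designer selects barcodes, so this is not its algorithm; `edit_distance`, `greedy_barcodes`, and all parameter values are hypothetical.

```python
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def greedy_barcodes(length, min_dist, n_candidates, seed=0):
    """Keep random candidates at least min_dist edits from all kept ones."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(n_candidates):
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if all(edit_distance(cand, c) >= min_dist for c in chosen):
            chosen.append(cand)
    return chosen


# Assumed toy settings: 12-mers separated by at least 4 edits.
barcodes = greedy_barcodes(length=12, min_dist=4, n_candidates=200)
```

A larger `min_dist` tolerates more sequencing errors per barcode but shrinks the set, which is the trade-off any barcode designer has to balance against barcode length.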
Citations: 0
Benchmarking and building DNA binding affinity models using allele-specific and allele-agnostic transcription factor binding data
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-10-31 | DOI: 10.1186/s13059-024-03424-2
Xiaoting Li, Lucas A. N. Melo, Harmen J. Bussemaker
Transcription factors (TFs) bind to DNA in a highly sequence-specific manner. This specificity manifests itself in vivo as differences in TF occupancy between the two alleles at heterozygous loci. Genome-scale assays such as ChIP-seq currently are limited in their power to detect allele-specific binding (ASB) both in terms of read coverage and representation of individual variants in the cell lines used. This makes prediction of allelic differences in TF binding from sequence alone desirable, provided that the reliability of such predictions can be quantitatively assessed. We here propose methods for benchmarking sequence-to-affinity models for TF binding in terms of their ability to predict allelic imbalances in ChIP-seq counts. We use a likelihood function based on an over-dispersed binomial distribution to aggregate evidence for allelic preference across the genome without requiring statistical significance for individual variants. This allows us to systematically compare predictive performance when multiple binding models for the same TF are available. To facilitate the de novo inference of high-quality models from paired-end in vivo binding data such as ChIP-seq, ChIP-exo, and CUT&Tag without read mapping or peak calling, we introduce an extensible reimplementation of our biophysically interpretable machine learning framework named PyProBound. Explicitly accounting for assay-specific bias in DNA fragmentation rate when training on ChIP-seq yields improved TF binding models. Moreover, we show how PyProBound can leverage our threshold-free ASB likelihood function to perform de novo motif discovery using allele-specific ChIP-seq counts. Our work provides new strategies for predicting the functional impact of non-coding variants.
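The threshold-free scoring idea in this abstract (summing evidence over all heterozygous variants with an over-dispersed binomial likelihood) can be sketched as follows. This assumes a beta-binomial parameterized by a mean `p` and an overdispersion `rho`, and assumes the expected reference-allele read fraction is a logistic function of the predicted log-affinity difference; the paper's exact likelihood and parameterization are not given in the abstract, and none of the names below come from PyProBound.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Log of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(k: int, n: int, p: float, rho: float) -> float:
    """Log-probability of k reference reads out of n.

    p is the expected reference fraction; rho in (0, 1) is the
    overdispersion (rho -> 0 approaches a plain binomial).
    """
    s = (1.0 - rho) / rho
    a, b = p * s, (1.0 - p) * s
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def predicted_ref_fraction(delta_log_affinity: float) -> float:
    """Assumed link: occupancy proportional to affinity (logistic)."""
    return 1.0 / (1.0 + exp(-delta_log_affinity))


def total_log_likelihood(variants, rho=0.1):
    """Sum over (ref_count, alt_count, predicted delta log-affinity)."""
    return sum(
        beta_binomial_logpmf(ref, ref + alt, predicted_ref_fraction(d), rho)
        for ref, alt, d in variants
    )


# Toy counts (assumed): a model predicting the right direction vs. a flipped one.
observed = [(30, 10, 1.0), (8, 25, -1.2)]
flipped = [(ref, alt, -d) for ref, alt, d in observed]
ll_good = total_log_likelihood(observed)
ll_bad = total_log_likelihood(flipped)
```

Because every variant contributes to the sum regardless of whether its individual imbalance is significant, two candidate binding models for the same TF can be ranked by comparing their total log-likelihoods on the same counts.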
Citations: 0
Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples"
IF 12.3 | Zone 1 (Biology) | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2024-10-30 | DOI: 10.1186/s13059-024-03232-8
Xinzhou Ge, Yumei Li, Wei Li, Jingyi Jessica Li
Two correspondences raised concerns or comments about our analyses regarding exaggerated false positives found by differential expression (DE) methods. Here, we discuss the points they raise and explain why we agree or disagree with these points. We add new analysis to confirm that the Wilcoxon rank-sum test remains the most robust method compared to the other five DE methods (DESeq2, edgeR, limma-voom, dearseq, and NOISeq) in two-condition DE analyses after considering normalization and winsorization, the data preprocessing steps discussed in the two correspondences.
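For readers unfamiliar with the two ingredients discussed here, the sketch below shows a simple upper-tail winsorization and a Wilcoxon rank-sum test using the normal approximation with tie correction (no continuity correction). It is a textbook illustration in plain Python, not the authors' analysis pipeline; the function names, the nearest-rank quantile rule, and the toy values are assumptions.

```python
from math import ceil, erfc, sqrt


def winsorize(values, upper_q=0.95):
    """Clip values above the upper_q nearest-rank quantile to that quantile."""
    ordered = sorted(values)
    cap = ordered[max(0, ceil(upper_q * len(ordered)) - 1)]
    return [min(v, cap) for v in values]


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic of x, p-value)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n_total = len(combined)
    ranks = [0.0] * n_total
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # midrank shared by tied values
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = r1 - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n_total + 1) - tie_term / (n_total * (n_total - 1)))
    if var == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / sqrt(var)
    return u, erfc(abs(z) / sqrt(2))


# Toy expression values (assumed) for one gene in two conditions.
clipped = winsorize([1, 2, 3, 100], upper_q=0.75)
u_stat, p_value = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

Because the rank-sum test only uses ranks, a single extreme count shifts its statistic far less than it shifts a mean-based statistic, which is one intuition for why winsorization matters more for parametric methods.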
Citations: 0
Neglecting the impact of normalization in semi-synthetic RNA-seq data simulations generates artificial false positives 在半合成 RNA-seq 数据模拟中忽略归一化的影响会产生人为假阳性结果
IF 12.3 Zone 1 Biology Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-10-30 DOI: 10.1186/s13059-024-03231-9
Boris P. Hejblum, Kalidou Ba, Rodolphe Thiébaut, Denis Agniel
A recent study reported exaggerated false positives by popular differential expression methods when analyzing large population samples. We reproduce the differential expression analysis simulation results and identify a caveat in the data generation process. Data not truly generated under the null hypothesis led to incorrect comparisons of benchmark methods. We provide corrected simulation results that demonstrate the good performance of dearseq and argue against the superiority of the Wilcoxon rank-sum test as suggested in the previous study.
Citations: 0