
Genome Research: latest articles

Factors impacting target-enriched long-read sequencing of resistomes and mobilomes
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-11-05 | DOI: 10.1101/gr.279226.124
Ilya B. Slizovskiy, Nathalie Bonin, Jonathan E. Bravo, Peter M. Ferm, Jacob Singer, Christina Boucher, Noelle R. Noyes
We investigated the efficiency of target-enriched long-read sequencing (TELSeq) for detecting antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) within complex matrices. We aimed to overcome limitations associated with traditional antimicrobial resistance (AMR) detection methods, including short-read shotgun metagenomics, which can lack sensitivity, specificity, and the ability to provide detailed genomic context. By combining biotinylated probe-based enrichment with long-read sequencing, we facilitated the amplification and sequencing of ARGs, eliminating the need for bioinformatic reconstruction. Our experimental design included replicates of human fecal microbiota transplant material, bovine feces, pristine prairie soil, and a mock human gut microbial community, allowing us to examine variables including genomic DNA input and probe set composition. Our findings demonstrated that TELSeq markedly improves the detection rates of ARGs and MGEs compared to traditional sequencing methods, underlining its potential for accurate AMR monitoring. A key insight from our research is the importance of incorporating mobilome profiles to better predict the transferability of ARGs within microbial communities, prompting a recommendation for the use of combined ARG–MGE probe sets for future studies. We also reveal limitations for ARG detection from low-input workflows, and describe the next steps for ongoing protocol refinement to minimize technical variability and expand utility in clinical and public health settings. This effort is part of our broader commitment to advancing methodologies that address the global challenge of AMR.
Citations: 0
Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-11-05 | DOI: 10.1101/gr.279170.124
Alexander J Ritter, Jolene M Draper, Chris Vollmers, Jeremy R Sanford

Alternative splicing (AS) alters the cis-regulatory landscape of mRNA isoforms, leading to transcripts with distinct localization, stability, and translational efficiency. To rigorously investigate mRNA isoform-specific ribosome association, we generated subcellular fractionation and sequencing (Frac-seq) libraries using both conventional short reads and long reads from human embryonic stem cells (ESCs) and neural progenitor cells (NPCs) derived from the same ESCs. We performed de novo transcriptome assembly from high-confidence long reads from cytosolic, monosomal, light, and heavy polyribosomal fractions and quantified their abundance using short reads from their respective subcellular fractions. Thousands of transcripts in each cell type exhibited association with particular subcellular fractions relative to the cytosol. Of the multi-isoform genes, 27% and 19% exhibited significant differential isoform sedimentation in ESCs and NPCs, respectively. Alternative promoter usage and internal exon skipping accounted for the majority of differences between isoforms from the same gene. Random forest classifiers implicated coding sequence (CDS) and untranslated region (UTR) lengths as important determinants of isoform-specific sedimentation profiles, and motif analyses reveal potential cell type-specific and subcellular fraction-associated RNA-binding protein signatures. Taken together, our data demonstrate that alternative mRNA processing within the CDS and UTRs impacts the translational control of mRNA isoforms during stem cell differentiation, and highlight the utility of using a novel long-read sequencing-based method to study translational control.

Citations: 0
Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-11-01 | DOI: 10.1101/gr.279346.124
Kristine Bilgrav Saether, Jesper Eisfeldt, Jesse D. Bengtsson, Ming Yin Lun, Christopher M. Grochowski, Medhat Mahmoud, Hsiao-Tuan Chao, Jill A. Rosenfeld, Pengfei Liu, Marlene Ek, Jakob Schuy, Adam Ameur, Hongzheng Dai, Undiagnosed Diseases Network, James Paul Hwang, Fritz J. Sedlazeck, Weimin Bi, Ronit Marom, Josephine Wincent, Ann Nordgren, Claudia M.B. Carvalho, Anna Lindstrand
Chromosomal inversions (INVs) are particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can only resolve ∼70% of cytogenetically visible inversions referred to clinical diagnostic laboratories, likely due to breakpoints in repetitive regions. Here, we study 12 inversions by long-read genome sequencing (lrGS) (n = 9) or srGS (n = 3) and resolve nine of them. In four cases, the inversion breakpoint region was missing from at least one of the human reference genomes (GRCh37, GRCh38, T2T-CHM13) and a reference agnostic analysis was needed. One of these cases, an INV9 mappable only in de novo assembled lrGS data using T2T-CHM13 disrupts EHMT1 consistent with a Mendelian diagnosis (Kleefstra syndrome 1; MIM#610253). Next, by pairwise comparison between T2T-CHM13, GRCh37, and GRCh38, as well as the chimpanzee and bonobo, we show that hundreds of megabases of sequence are missing from at least one human reference, highlighting that primate genomes contribute to genomic diversity. Aligning population genomic data to these regions indicated that these regions are variable between individuals. Our analysis emphasizes that T2T-CHM13 is necessary to maximize the value of lrGS for optimal inversion detection in clinical diagnostics. These results highlight the importance of leveraging diverse and comprehensive reference genomes to resolve unsolved molecular cases in rare diseases.
Citations: 0
High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-30 | DOI: 10.1101/gr.279273.124
Jonas A Gustafson, Sophia B Gibson, Nikhita Damaraju, Miranda P G Zalusky, Kendra Hoekzema, David Twesigomwe, Lei Yang, Anthony A Snead, Phillip A Richmond, Wouter De Coster, Nathan D Olson, Andrea Guarracino, Qiuhui Li, Angela L Miller, Joy Goffena, Zachary B Anderson, Sophie H R Storz, Sydney A Ward, Maisha Sinha, Claudia Gonzaga-Jauregui, Wayne E Clarke, Anna O Basile, André Corvelo, Catherine Reeves, Adrienne Helland, Rajeeva Lochan Musunuri, Mahler Revsine, Karynne E Patterson, Cate R Paschal, Christina Zakarian, Sara Goodwin, Tanner D Jensen, Esther Robb, W Richard McCombie, Fritz J Sedlazeck, Justin M Zook, Stephen B Montgomery, Erik Garrison, Mikhail Kolmogorov, Michael C Schatz, Richard N McLaughlin, Harriet Dashnow, Michael C Zody, Matt Loose, Miten Jain, Evan E Eichler, Danny E Miller

Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
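The abstract reports a sequence read N50 of 54 kbp. As a reminder of what that statistic measures, here is a minimal sketch of the usual N50 definition: the length L such that reads of length L or longer contain at least half of all sequenced bases. The function name `n50` and the toy read lengths are illustrative assumptions, not values or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Conventional definition (assumed here): sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy, made-up read lengths (not data from the paper).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # prints 8: 10 + 9 + 8 = 27, half of the 54 total
```

Because N50 weights reads by the bases they contribute, a few very long reads raise it far more than they raise the mean read length.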

Citations: 0
Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-30 | DOI: 10.1101/gr.279265.124
Wouter De Coster, Ida Höijer, Inge Bruggeman, Svenn D'Hert, Malin Melin, Adam Ameur, Rosa Rademakers

The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap using our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals of The 1000 Genomes Project cohort sequenced on the Oxford Nanopore Technologies PromethION. A comprehensive set of medically relevant tandem repeats has been genotyped using STRdust and LongTR to determine the tandem repeat length and sequence composition. PathSTR provides rich visualizations of this data set and the feature to upload one's data for comparison along the control cohort. We demonstrate the implementation of this application using data from targeted nanopore sequencing of a patient with myotonic dystrophy type 1. This resource will empower the genetics community to get a more complete overview of normal variation in tandem repeat length and sequence composition and, as such, enable a better assessment of rare tandem repeat alleles observed in patients.
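The idea of comparing one sample's repeat length against a control cohort can be illustrated with a small sketch. This is not pathSTR's implementation or API; the function name `fraction_of_controls_below` and the numbers are hypothetical, and the lengths are assumed to be repeat-unit counts at a single locus.

```python
def fraction_of_controls_below(control_lengths, query_length):
    """Fraction of control alleles strictly shorter than the query allele.

    Hypothetical helper for illustration only: a value near 1.0 means the
    query allele is longer than nearly every allele in the control cohort.
    """
    if not control_lengths:
        raise ValueError("control cohort is empty")
    shorter = sum(1 for length in control_lengths if length < query_length)
    return shorter / len(control_lengths)


# Made-up control allele lengths (repeat units) at one locus.
controls = [5, 12, 12, 13, 20, 21, 27, 30]

print(fraction_of_controls_below(controls, 13))   # prints 0.375
print(fraction_of_controls_below(controls, 400))  # prints 1.0
```

A real comparison would also need to consider sequence composition and per-allele genotypes, which this length-only sketch ignores.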

Citations: 0
Geometric deep learning framework for de novo genome assembly
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-29 | DOI: 10.1101/gr.279307.124
Lovro Vrček, Xavier Bresson, Thomas Laurent, Martin Schmitz, Kenji Kawaguchi, Mile Šikić
The critical stage of every de novo genome assembler is identifying paths in assembly graphs that correspond to the reconstructed genomic sequences. The existing algorithmic methods struggle with this, primarily due to repetitive regions causing complex graph tangles, leading to fragmented assemblies. Here, we introduce GNNome, a framework for path identification based on geometric deep learning that enables training models on assembly graphs without relying on existing assembly strategies. By leveraging only the symmetries inherent to the problem, GNNome reconstructs assemblies from PacBio HiFi reads with contiguity and quality comparable to those of the state-of-the-art tools across several species. With every new genome assembled telomere-to-telomere, the amount of reliable training data at our disposal increases. Combining the straightforward generation of abundant simulated data for diverse genomic structures with the AI approach makes the proposed framework a plausible cornerstone for future work on reconstructing complex genomes with different ploidy and aneuploidy degrees. To facilitate such developments, we make the framework and the best-performing model publicly available, provided as a tool that can directly be used to assemble new haploid genomes.
Citations: 0
The chromatin tapestry as a framework for neurodevelopment.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-29 | DOI: 10.1101/gr.278408.123
Ben Nolan, Timothy E Reznicek, Christopher T Cummings, M Jordan Rowley

The neuronal nucleus houses a meticulously organized genome. Within this structure, genetic material is not simply compacted but arranged into a precise and functional 3D chromatin landscape essential for cellular regulation. This mini-review highlights the importance of this chromatin landscape in healthy neurodevelopment, as well as the diseases that occur with aberrant chromatin architecture. We discuss insights into the fundamental mechanistic relationship between histone modifications, DNA methylation, and genome organization. We then discuss findings that reveal how these epigenetic features change throughout normal neurodevelopment. Finally, we highlight single-gene neurodevelopmental disorders that illustrate the interdependence of epigenetic features, showing how disruptions in DNA methylation or genome architecture can ripple across the entire epigenome. As such, we emphasize the importance of measuring multiple chromatin architectural aspects, as the disruption of one mechanism can likely impact others in the intricate epigenetic network. This mini-review underscores the vast gaps in our understanding of chromatin structure in neurodevelopmental diseases and the substantial research needed to understand the interplay between chromatin features and neurodevelopment.

Citations: 0
Resolving complex duplication variants in autism spectrum disorder using long-read genome sequencing
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-29 DOI: 10.1101/gr.279263.124
Jesper Eisfeldt, Edward J. Higginbotham, Felix Lenner, Jennifer Howe, Bridget A. Fernandez, Anna Lindstrand, Stephen W. Scherer, Lars Feuk
Rare or de novo structural variation, primarily in the form of copy number variants, is detected in 5%–10% of autism spectrum disorder (ASD) families. While complex structural variants involving duplications can generally be detected using microarray or short-read genome sequencing (GS), these methods frequently fail to characterize breakpoints at nucleotide resolution, requiring additional molecular methods for validation and fine-mapping. Here, we use Oxford Nanopore Technologies PromethION long-read GS to characterize complex genomic rearrangements (CGRs) involving large duplications that segregate with ASD in five families. In total, we investigated 13 CGR carriers and were able to resolve all breakpoint junctions at nucleotide resolution. While all breakpoints were identified, the precise genomic architecture of one rearrangement remained unresolved with three different potential structures. The findings in two families include potential fusion genes formed through duplication rearrangements, involving IL1RAPL1–DMD and SUPT16H–CHD8. In two of the families originating from the same geographical region, an identical rearrangement involving ANK2 was identified, which likely represents a founder variant. In addition, we analyze methylation status directly from the long-read data, allowing us to assess the activity of rearranged genes and regulatory regions. Investigation of methylation across the CGRs reveals aberrant methylation status in carriers across a rearrangement affecting the CREBBP locus. In aggregate, our results demonstrate the utility of nanopore sequencing to pinpoint CGRs associated with ASD in five unrelated families, and highlight the importance of a gene-centric description of disease-associated complex chromosomal rearrangements.
Citations: 0
A national long-read sequencing study on chromosomal rearrangements uncovers hidden complexities
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-29 DOI: 10.1101/gr.279510.124
Jesper Eisfeldt, Adam Ameur, Felix Lenner, Esmee Ten Berk de Boer, Marlene Ek, Josephine Wincent, Raquel Vaz, Jesper Ottosson, Tord Jonson, Sofie Ivarsson, Sofia Thunström, Alexandra Topa, Simon Stenberg, Anna Rohlin, Anna Sandestig, Margareta Nordling, Pia Palmebäck, Magnus Burstedt, Frida Nordin, Eva-Lena Stattin, Maria Sobol, Panagiotis Baliakas, Marie-Louise Bondeson, Ida Höijer, Kristine Bilgrav Saether, Lovisa Lovmar, Hans Ehrencrona, Malin Melin, Lars Feuk, Anna Lindstrand
Clinical genetic laboratories often require a comprehensive analysis of chromosomal rearrangements/structural variants (SVs), from large events like translocations and inversions to supernumerary ring/marker chromosomes and small deletions or duplications. Understanding the complexity of these events and their clinical consequences requires pinpointing breakpoint junctions and resolving the derivative chromosome structure. This task often surpasses the capabilities of short-read sequencing technologies. In contrast, long-read sequencing techniques present a compelling alternative for clinical diagnostics. Here, Genomic Medicine Sweden—Rare Diseases has explored the utility of HiFi Revio long-read genome sequencing (lrGS) for digital karyotyping of SVs nationwide. The 16 samples from 13 families were collected from all Swedish healthcare regions. Prior investigations had identified 16 SVs, ranging from simple to complex rearrangements, including inversions, translocations, and copy number variants. We have established a national pipeline and a shared variant database for variant calling and filtering. Using lrGS, 14 of the 16 known SVs are detected. Of these, 13 are mapped at nucleotide resolution, and one complex rearrangement is only visible by read depth. Two Chromosome 21 rearrangements, one mosaic, remain undetected. Average read lengths are 8.3–18.8 kb with coverage exceeding 20× for all samples. De novo assembly results in a limited number of phased contigs per individual (N50 6–86 Mb), enabling direct characterization of the chromosomal rearrangements. In a national pilot study, we demonstrate the utility of HiFi Revio lrGS for analyzing chromosomal rearrangements. Based on our results, we propose a 5-year plan to expand lrGS use for rare disease diagnostics in Sweden.
Citations: 0
Long-read RNA sequencing reveals allele-specific N6-methyladenosine modifications
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-29 DOI: 10.1101/gr.279270.124
Dayea Park, Can Cenik
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA using the Oxford Nanopore technology promises the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage these advantages to determine allele-biased patterns of N6-methyladenosine (m6A) modifications in native mRNA. We utilized human and mouse cells with known genetic variants to assign allelic origin of each mRNA molecule combined with a supervised machine learning model to detect read-level m6A modification ratios. Our analyses revealed the importance of sequences adjacent to the DRACH-motif in determining m6A deposition, in addition to allelic differences that directly alter the motif. Moreover, we discovered allele-specific m6A modification (ASM) events with no genetic variants in close proximity to the differentially modified nucleotide, demonstrating the unique advantage of using long reads and surpassing the capabilities of antibody-based short-read approaches. This technological advancement promises to advance our understanding of the role of genetics in determining mRNA modifications.
Citations: 0