Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease

Genome Research. IF 5.5; Zone 2 (Biology); JCR Q1, Biochemistry & Molecular Biology. Pub Date: 2025-03-20. DOI: 10.1101/gr.279323.124
Tanner D. Jensen, Bohan Ni, Chloe M. Reuter, John E. Gorzynski, Sarah Fazal, Devon Bonner, Rachel A. Ungar, Pagé C. Goddard, Archana Raja, Euan A. Ashley, Jonathan A. Bernstein, Stephan Zuchner, Undiagnosed Diseases Network, Michael D. Greicius, Stephen B. Montgomery, Michael C. Schatz, Matthew T. Wheeler, Alexis Battle

Abstract

Rare structural variants (SVs), including insertions, deletions, and complex rearrangements, can cause Mendelian disease, yet they remain difficult to detect and interpret accurately. We sequenced and analyzed Oxford Nanopore Technologies long-read genomes of 68 individuals from the Undiagnosed Diseases Network (UDN) in whom short-read sequencing had identified no diagnostic mutations. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected an average of 716 rare (MAF < 0.01) SV alleles per long-read genome, a 2.4× increase over short reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression in blood or fibroblasts from the same individuals and found that rare SVs overlapping enhancers were enriched (log odds ratio, LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably, these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations and significantly outperforms baseline models that do not incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. Notably, these included compound heterozygous deletions in FAM177A1 shared by two siblings, which were likely causal for a rare neurodevelopmental disorder. Our observations demonstrate the promise of integrating long-read sequencing with gene expression to improve the prioritization of functional SVs and TREs in rare disease patients.
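The enrichment statistic quoted in the abstract can be illustrated with a small sketch. The abstract does not say how the LOR was estimated, so the code below is only one common way to compute a log odds ratio from a 2×2 contingency table. It assumes the natural logarithm and a Haldane-Anscombe pseudocount. The function name `log_odds_ratio`, the table layout, and the example counts are all hypothetical and are not taken from the paper.

```python
import math

def log_odds_ratio(a, b, c, d, pseudocount=0.5):
    """Return (log odds ratio, standard error) for a 2x2 table.

    Hypothetical table layout (not from the paper):
      a: outlier genes with a rare SV nearby
      b: outlier genes without a rare SV nearby
      c: non-outlier genes with a rare SV nearby
      d: non-outlier genes without a rare SV nearby

    A pseudocount is added to every cell (Haldane-Anscombe correction)
    so the estimate stays finite when a cell is zero.
    """
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    lor = math.log((a * d) / (b * c))              # natural-log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Wald standard error
    return lor, se

# Illustrative counts only; these are made up, not results from the study.
lor, se = log_odds_ratio(30, 970, 200, 9800)
ci_low, ci_high = lor - 1.96 * se, lor + 1.96 * se  # approximate 95% interval
print(f"LOR = {lor:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A positive LOR means the feature (here, a nearby rare SV) is more common among outlier genes than among non-outlier genes. If the reported LOR of 0.46 is on the natural-log scale, it would correspond to an odds ratio of about 1.58.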