Assisting the analysis of insertions and deletions using regional allele frequencies

IF 3.9 | Zone 4, Biology | Q1, GENETICS & HEREDITY | Functional & Integrative Genomics | Pub Date: 2024-05-20 | DOI: 10.1007/s10142-024-01358-3
Sarath Babu Krishna Murthy, Sandy Yang, Shiraz Bheda, Nikita Tomar, Haiyue Li, Amir Yaghoobi, Atlas Khan, Krzysztof Kiryluk, Joshua E. Motelow, Nick Ren, Ali G. Gharavi, Hila Milo Rasouly

Abstract


Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines the AF of distinct indels within a given region and provides a “regional AF” (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK Biobank (UKBB; n=469,835). By comparing rAF against standard AF (sAF), we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as “rAF-hi” indels. Notably, a high percentage of rare indels were “rAF-hi”, with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9%, depending on the CRAFTS-indels’ parameters). Analysis of the overlap of regions based on their rAF with low-complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels on gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB, and used by ClinVar to revise indel classifications.
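
The rAF idea in the abstract can be illustrated with a minimal sketch. This is not the CRAFTS-indels implementation: the abstract does not say how a region is delimited or how the AFs are combined, so the symmetric window (`window`, in base pairs), the plain summation of AFs, and the names `Indel`, `regional_af`, and `is_raf_hi` are all assumptions made for illustration. Only the 10⁻⁴ threshold and the rAF-hi definition come from the text.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass

RARE_THRESHOLD = 1e-4  # threshold quoted in the abstract


@dataclass(frozen=True)
class Indel:
    """Hypothetical record for one indel; field names are illustrative."""
    chrom: str
    pos: int    # 1-based start position
    ref: str
    alt: str
    af: float   # standard allele frequency (sAF) in the cohort


def regional_af(indels, window=10):
    """Map each indel to an assumed rAF: the sum of the sAF of every
    indel on the same chromosome within +/- `window` bp (itself included)."""
    by_chrom = {}
    for v in indels:
        by_chrom.setdefault(v.chrom, []).append(v)

    result = {}
    for variants in by_chrom.values():
        variants.sort(key=lambda v: v.pos)
        positions = [v.pos for v in variants]
        for v in variants:
            # Binary search for the slice of indels inside the window.
            lo = bisect_left(positions, v.pos - window)
            hi = bisect_right(positions, v.pos + window)
            result[v] = sum(u.af for u in variants[lo:hi])
    return result


def is_raf_hi(saf, raf, threshold=RARE_THRESHOLD):
    """rAF-hi as defined in the abstract: rare by sAF, not rare by rAF."""
    return saf <= threshold and raf > threshold
```

Under these assumptions, two neighbouring indels that are each rare on their own (for example, sAF of 6 × 10⁻⁵ and 5 × 10⁻⁵ three base pairs apart) would share an rAF above 10⁻⁴ and both be flagged rAF-hi, while an isolated rare indel would not.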

Source journal
CiteScore: 3.50
Self-citation rate: 3.40%
Articles published: 92
Review time: 2 months
About the journal: Functional & Integrative Genomics is devoted to large-scale studies of genomes and their functions, including systems analyses of biological processes. The journal will provide the research community with an integrated platform where researchers can share, review and discuss their findings on important biological questions that will ultimately enable us to answer the fundamental question: How do genomes work?