LoFTK: a framework for fully automated calculation of predicted Loss-of-Function variants and genes.

Biodata Mining · IF 6.1 · Zone 3, Biology · Q1, MATHEMATICAL & COMPUTATIONAL BIOLOGY · Pub Date: 2023-02-02 · DOI: 10.1186/s13040-023-00321-5
Abdulrahman Alasiri, Konrad J Karczewski, Brian Cole, Bao-Li Loza, Jason H Moore, Sander W van der Laan, Folkert W Asselbergs, Brendan J Keating, Jessica van Setten
Biodata Mining 16(1):3, published 2023-02-02. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9893534/pdf/

Abstract

Background: Loss-of-Function (LoF) variants in human genes are important due to their impact on clinical phenotypes and frequent occurrence in the genomes of healthy individuals. The association of LoF variants with complex diseases and traits may lead to the discovery and validation of novel therapeutic targets. Current approaches predict high-confidence LoF variants without identifying the specific genes or the number of copies they affect. Moreover, there is a lack of methods for detecting knockout genes caused by compound heterozygous (CH) LoF variants.

Results: We have developed the Loss-of-Function ToolKit (LoFTK), which allows efficient and automated prediction of LoF variants from genotyped, imputed and sequenced genomes. LoFTK enables the identification of genes that are inactive in one or two copies and provides summary statistics for downstream analyses. LoFTK can identify CH LoF variants, which result in LoF genes with two copies lost. Using data from parents and offspring we show that 96% of CH LoF genes predicted by LoFTK in the offspring have the respective alleles donated by each parent.
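The distinction between genes inactive in one copy and genes inactive in two copies, including the compound heterozygous case, can be illustrated with a minimal Python sketch. This is not LoFTK's implementation: the function names, the input layout (a list of `(gene, variant_id, (hap1, hap2))` tuples with 1 marking a LoF allele on that haplotype) and the gene and variant labels are all assumptions made for illustration. It assumes phased genotypes, because phasing is what separates two LoF variants on the same haplotype (one copy lost) from two on opposite haplotypes (two copies lost).

```python
from collections import defaultdict


def lof_copies_per_gene(phased_calls):
    """Return {gene: number of haplotypes (0, 1 or 2) carrying at least one LoF allele}.

    phased_calls: list of (gene, variant_id, (hap1_allele, hap2_allele)),
    where an allele is 1 if that haplotype carries the LoF allele, else 0.
    """
    hit = defaultdict(lambda: [False, False])  # gene -> [hap1 hit, hap2 hit]
    for gene, _variant, (a1, a2) in phased_calls:
        hit[gene][0] |= bool(a1)
        hit[gene][1] |= bool(a2)
    return {gene: int(h[0]) + int(h[1]) for gene, h in hit.items()}


def classify_two_copy_loss(phased_calls, gene):
    """Return 'homozygous', 'compound_het', or None if fewer than two copies are lost."""
    hap1 = {v for g, v, (a1, _) in phased_calls if g == gene and a1}
    hap2 = {v for g, v, (_, a2) in phased_calls if g == gene and a2}
    if not hap1 or not hap2:
        return None
    # The same variant on both haplotypes is a homozygous LoF;
    # different variants on opposite haplotypes are compound heterozygous.
    return "homozygous" if hap1 & hap2 else "compound_het"


# Hypothetical example data (gene and variant names are placeholders).
calls = [
    ("GENE_A", "var1", (1, 0)),  # LoF on haplotype 1
    ("GENE_A", "var2", (0, 1)),  # different LoF on haplotype 2
    ("GENE_B", "var3", (1, 1)),  # same LoF on both haplotypes
    ("GENE_C", "var4", (1, 0)),  # two LoF variants, both on haplotype 1
    ("GENE_C", "var5", (1, 0)),
]
copies = lof_copies_per_gene(calls)
labels = {g: classify_two_copy_loss(calls, g) for g in copies}
```

In this toy input, `GENE_A` loses both copies through two different variants, `GENE_B` through one homozygous variant, and `GENE_C` loses only one copy because both of its variants sit on the same haplotype.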

Conclusions: LoFTK is a command-line tool that provides a reliable computational workflow for predicting LoF variants from genotyped and sequenced genomes and for identifying genes that are inactive in one or two copies. LoFTK is open software and is freely available to non-commercial users at https://github.com/CirculatoryHealth/LoFTK.

Source journal: Biodata Mining (MATHEMATICAL & COMPUTATIONAL BIOLOGY)
CiteScore: 7.90
Self-citation rate: 0.00%
Articles published: 28
Review time: 23 weeks
About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.
Latest articles from this journal:
- Genotype subtyping approach to identify unnoticed variants in diseases from GWAS data.
- Generalization or mirage? Data leakage and reported performance in neonatal EEG seizure detection models: a systematic review.
- SBT-Net: a tri-cue guided multimodal fusion framework for depression recognition.
- Cross-cohort genetic risk prediction for Alzheimer's disease: a transfer learning approach using GWAS and deep learning models.
- An unsupervised tool for biomarker discovery and cancer subtyping applied to glioblastoma.