GAUSS: a summary-statistics-based R package for accurate estimation of linkage disequilibrium for variants, gaussian imputation and TWAS analysis of cosmopolitan cohorts.

IF 4.4 | Zone 3, Biology | Q1 BIOCHEMICAL RESEARCH METHODS | Bioinformatics | Pub Date: 2024-04-17 | DOI: 10.1093/bioinformatics/btae203
Donghyung Lee, S. Bacanu

Abstract

MOTIVATION

As the availability of larger and more ethnically diverse reference panels grows, there is an increase in demand for ancestry-informed imputation of genome-wide association studies (GWAS), and other downstream analyses, e.g., fine-mapping. Performing such analyses at the genotype level is computationally challenging and necessitates, at best, a laborious process to access individual-level genotype and phenotype data. Summary-statistics-based tools, not requiring individual-level data, provide an efficient alternative that streamlines computational requirements and promotes open science by simplifying the re-analysis and downstream analysis of existing GWAS summary data. However, existing tools perform only disparate parts of needed analysis, have only command-line interfaces, and are difficult to extend/link by applied researchers.

RESULTS

To address these challenges, we present GAUSS, a comprehensive and user-friendly R package designed to facilitate the re-analysis/downstream analysis of GWAS summary statistics. GAUSS offers an integrated toolkit for a range of functionalities, including i) estimating ancestry proportion of study cohorts, ii) calculating ancestry-informed linkage disequilibrium, iii) imputing summary statistics of unobserved variants, iv) conducting transcriptome-wide association studies, and v) correcting for "Winner's Curse" biases. Notably, GAUSS utilizes an expansive, multi-ethnic reference panel consisting of 32,953 genomes from 29 ethnic groups. This panel enhances the range and accuracy of imputable variants, including the ability to impute summary statistics of rarer variants. As a result, GAUSS elevates the quality and applicability of existing GWAS analyses without requiring access to subject-level genotypic and phenotypic information.

AVAILABILITY AND IMPLEMENTATION

The GAUSS R package, complete with its source code, is readily accessible to the public via our GitHub repository at https://github.com/statsleelab/gauss. To further assist users, we provided illustrative use-case scenarios that are conveniently found at https://statsleelab.github.io/gauss/, along with a comprehensive user guide detailed in Supplementary Text 1 from Supplementary Data.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
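
To make items ii) and iii) of the abstract more concrete, the following base-R sketch shows one common way to impute z-scores of unobserved variants from summary statistics: mix per-population LD matrices by the cohort's ancestry proportions, then take the conditional Gaussian mean of the untyped z-scores given the typed ones. It does not use the GAUSS API. The function names (`mix_ld`, `impute_z`), the variant IDs, the toy LD values, the ancestry weights, and the ridge penalty are all assumptions made for illustration. The plain weighted average of correlation matrices is a deliberate simplification; a production method would typically also account for allele-frequency differences between populations.

```r
# Illustrative sketch only: hypothetical names and toy numbers, not the GAUSS API.

ids <- c("rs1", "rs2", "rs3")  # rs1 and rs2 are typed, rs3 is untyped

# Toy per-population LD (correlation) matrices for the three variants.
ld_pop1 <- matrix(c(1.0, 0.2, 0.6,
                    0.2, 1.0, 0.3,
                    0.6, 0.3, 1.0),
                  nrow = 3, byrow = TRUE, dimnames = list(ids, ids))
ld_pop2 <- matrix(c(1.0, 0.4, 0.2,
                    0.4, 1.0, 0.5,
                    0.2, 0.5, 1.0),
                  nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Assumed ancestry proportions of the study cohort (must sum to 1).
w <- c(pop1 = 0.7, pop2 = 0.3)

# Simplified ancestry-informed LD: weighted average of per-population matrices.
mix_ld <- function(ld_list, weights) {
  stopifnot(length(ld_list) == length(weights),
            abs(sum(weights) - 1) < 1e-8)
  Reduce(`+`, Map(function(m, wt) wt * m, ld_list, weights))
}

# Conditional-Gaussian imputation: E[z_u | z_t] = S_ut %*% solve(S_tt) %*% z_t.
# A small ridge term on S_tt is an assumed safeguard against near-singular LD.
impute_z <- function(ld, z_typed, typed, untyped, ridge = 0.1) {
  s_tt <- ld[typed, typed, drop = FALSE] + ridge * diag(length(typed))
  s_ut <- ld[untyped, typed, drop = FALSE]
  a <- s_ut %*% solve(s_tt)                # imputation weights
  z_hat <- as.vector(a %*% z_typed)        # imputed z-scores
  info <- as.vector(diag(a %*% t(s_ut)))   # share of variance recovered
  data.frame(variant = untyped, z_imputed = z_hat, info = info)
}

ld_mix <- mix_ld(list(ld_pop1, ld_pop2), w)
res <- impute_z(ld_mix, z_typed = c(2.0, -1.0),
                typed = c("rs1", "rs2"), untyped = "rs3")
print(res)
```

The `info` column is a rough imputation-quality measure: values near 0 mean the typed variants carry little information about the untyped one, so its imputed z-score should be interpreted with caution.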