Characterizing substructure via mixture modeling in large-scale genetic summary statistics.

IF 8.1 | Zone 1, Biology | Q1 GENETICS & HEREDITY | American journal of human genetics | Pub Date: 2025-01-11 | DOI: 10.1016/j.ajhg.2024.12.007
Hayley R Stoneman, Adelle M Price, Nikole Scribner Trout, Riley Lamont, Souha Tifour, Nikita Pozdeyev, Kristy Crooks, Meng Lin, Nicholas Rafaels, Christopher R Gignoux, Katie M Marker, Audrey E Hendricks
Citations: 0

Abstract

Genetic summary data are broadly accessible and highly useful, including for risk prediction, causal inference, fine mapping, and incorporation of external controls. However, collapsing individual-level data into summary data, such as allele frequencies, masks intra- and inter-sample heterogeneity, leading to confounding, reduced power, and bias. Ultimately, unaccounted-for substructure limits summary data usability, especially for understudied or admixed populations. There is a need for methods to enable the harmonization of summary data where the underlying substructure is matched between datasets. Here, we present Summix2, a comprehensive set of methods and software based on a computationally efficient mixture model to enable the harmonization of genetic summary data by estimating and adjusting for substructure. In extensive simulations and application to public data, we show that Summix2 characterizes finer-scale population structure, identifies ascertainment bias, and scans for potential regions of selection due to local substructure deviation. Summix2 increases the robust use of diverse, publicly available summary data, resulting in improved and more equitable research.
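One way to read "mixture model" for allele-frequency data is to treat the observed frequencies as a weighted average of reference-group frequencies and to estimate the weights under the constraints that they are non-negative and sum to one. The sketch below illustrates that idea only; it is not the Summix2 implementation. The solver (projected gradient descent on a least-squares objective), the simulated data, and all names such as `estimate_proportions` are assumptions made for illustration.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_k >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the largest valid j
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(observed, reference, iterations=1000):
    """Estimate mixture proportions from summary allele frequencies.

    observed:  list of allele frequencies in the target sample, one per SNP
    reference: list of rows, one per SNP, each holding the allele frequency
               in every reference group (hypothetical input layout)
    """
    n_snps = len(observed)
    k = len(reference[0])
    # Frequencies lie in [0, 1], so 1 / (2k) is a safe gradient step size.
    step = 1.0 / (2.0 * k)
    pi = [1.0 / k] * k  # start from equal proportions
    for _ in range(iterations):
        grad = [0.0] * k
        for obs, ref_row in zip(observed, reference):
            resid = obs - sum(p * r for p, r in zip(pi, ref_row))
            for j in range(k):
                grad[j] -= 2.0 * resid * ref_row[j] / n_snps
        # Gradient step, then project back onto the probability simplex.
        pi = project_to_simplex([p - step * g for p, g in zip(pi, grad)])
    return pi


# Simulated, noise-free example: three reference groups, 200 SNPs.
rng = random.Random(0)
true_pi = [0.6, 0.3, 0.1]
reference = [[rng.random() for _ in true_pi] for _ in range(200)]
observed = [sum(p * r for p, r in zip(true_pi, row)) for row in reference]

estimated = estimate_proportions(observed, reference)
print([round(p, 3) for p in estimated])
```

Because the simulated frequencies here are exact mixtures, the estimate should recover the simulated proportions closely; real summary data add sampling noise and reference mismatch, which is where a method's handling of substructure matters.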

Source journal
CiteScore: 14.70
Self-citation rate: 4.10%
Articles published: 185
Review time: 1 month
About the journal: The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.
Latest articles from this journal
Discovery of a DNA methylation profile in individuals with Sifrim-Hitz-Weiss syndrome.
Characterizing substructure via mixture modeling in large-scale genetic summary statistics.
Bi-allelic KICS2 mutations impair KICSTOR complex-mediated mTORC1 regulation, causing intellectual disability and epilepsy.
A unified framework for cell-type-specific eQTL prioritization by integrating bulk and scRNA-seq data.
Functional characterization of eQTLs and asthma risk loci with scATAC-seq across immune cell types and contexts.