Pub Date: 2025-09-04 | Epub Date: 2025-07-22 | DOI: 10.1016/j.ajhg.2025.06.018
Jasmine Baker, Erik Stricker, Julie Coleman, Shamika Ketkar, Taotao Tan, Ashley M Butler, LaTerrica Williams, Latanya Hammonds-Odie, Debra Murray, Brendan Lee, Kim C Worley, Elizabeth G Atkinson
A lack of representation in genomic research and limited access to computational training create barriers for many researchers seeking to analyze large-scale genetic datasets. The All of Us Research Program provides an unprecedented opportunity to address these gaps by offering genomic data from a broad range of participants, but its impact depends on equipping researchers with the necessary skills to use it effectively. The All of Us Biomedical Researcher (BR) Scholars Program at Baylor College of Medicine aims to break down these barriers by providing early-career researchers with hands-on training in computational genomics through the All of Us Evenings with Genetics Research Program. The year-long program begins with the faculty summit, an in-person computational boot camp that introduces scholars to foundational skills for using the All of Us dataset via a cloud-based research environment. The genomics tutorials focus on genome-wide association studies (GWASs), utilizing Jupyter Notebooks and the Hail computing framework to provide an accessible and scalable approach to large-scale data analysis. Scholars engage in hands-on exercises covering data preparation, quality control, association testing, and result interpretation. By the end of the summit, participants will have successfully conducted a GWAS, visualized key findings, and gained confidence in computational resource management. This initiative expands access to genomic research by equipping early-career researchers from a variety of backgrounds with the tools and knowledge to analyze All of Us data. By lowering barriers to entry and promoting the study of representative populations, the program fosters innovation in precision medicine and advances equity in genomic research.
Title: Implementing a training resource for large-scale genomic data analysis in the All of Us Researcher Workbench. | Journal: American Journal of Human Genetics, pp. 2001-2009 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320718/pdf/
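The workflow named in the abstract (data preparation, quality control, association testing, result interpretation) can be sketched in a few lines. The following is a toy illustration in plain Python, not the program's Hail notebooks and not All of Us data: the genotypes are simulated, and the 95% call-rate threshold, the effect size, and all function names are assumptions chosen for illustration.

```python
import math
import random


def call_rate(genotypes):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def linear_assoc(genotypes, phenotypes):
    """Simple linear regression of phenotype on allele dosage (0/1/2).

    Returns (beta, standard_error, z). Samples with a missing call are dropped.
    """
    pairs = [(g, y) for g, y in zip(genotypes, phenotypes) if g is not None]
    n = len(pairs)
    mean_g = sum(g for g, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((g - mean_g) ** 2 for g, _ in pairs)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in pairs)
    beta = sxy / sxx
    resid_ss = sum((y - mean_y - beta * (g - mean_g)) ** 2 for g, y in pairs)
    se = math.sqrt(resid_ss / (n - 2) / sxx)
    return beta, se, beta / se


def two_sided_p(z):
    """Two-sided p value from a z statistic (normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))


random.seed(1)
N_SAMPLES = 2000


def simulate_variant(maf, missing_rate):
    """Simulate dosages for one biallelic variant, with random missing calls."""
    calls = []
    for _ in range(N_SAMPLES):
        if random.random() < missing_rate:
            calls.append(None)
        else:
            calls.append((random.random() < maf) + (random.random() < maf))
    return calls


# Data preparation: three simulated variants (hypothetical names).
variants = {
    "causal": simulate_variant(0.3, 0.01),
    "null": simulate_variant(0.3, 0.01),
    "low_quality": simulate_variant(0.3, 0.20),
}
# The phenotype depends on the "causal" variant only (missing dosage counts as 0).
phenotypes = [0.5 * (g or 0) + random.gauss(0, 1) for g in variants["causal"]]

# Quality control, then association testing.
results = {}
for name, genos in variants.items():
    if call_rate(genos) < 0.95:  # assumed QC threshold
        continue
    beta, se, z = linear_assoc(genos, phenotypes)
    results[name] = (beta, se, two_sided_p(z))

# Result interpretation: compare each p value with the 5e-8 genome-wide threshold.
for name, (beta, se, p) in results.items():
    print(f"{name}: beta={beta:.3f} se={se:.3f} p={p:.2e} hit={p < 5e-8}")
```

A real analysis would also adjust for covariates such as ancestry principal components and would run on a distributed framework; this sketch only shows the order of the steps.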
Pub Date: 2025-09-04 | DOI: 10.1016/j.ajhg.2025.07.015
Yu-Jyun Huang, Nuzulul Kurniansyah, Daniel F Levey, Joel Gelernter, Jennifer E Huffman, Kelly Cho, Peter W F Wilson, Daniel J Gottlieb, Kenneth M Rice, Tamar Sofer
Strong sex differences exist in sleep phenotypes and also cardiovascular diseases (CVDs). However, sex-specific causal effects of sleep phenotypes on CVD-related outcomes have not been thoroughly examined. Mendelian randomization (MR) analysis is a useful approach for estimating the causal effect of a risk factor on an outcome of interest when interventional studies are not available. We first conducted sex-specific genome-wide association studies (GWASs) for suboptimal-sleep phenotypes (insomnia, obstructive sleep apnea [OSA], short and long sleep durations, and excessive daytime sleepiness) utilizing the Million Veteran Program (MVP) dataset. We then developed a semi-empirical Bayesian framework that (1) calibrates variant-phenotype effect estimates by leveraging information across sex groups and (2) applies shrinkage sex-specific effect estimates in MR analysis to alleviate weak instrumental bias when sex groups are analyzed in isolation. Simulation studies demonstrate that the causal effect estimates derived from our framework are substantially more efficient than those obtained through conventional methods. We estimated the causal effects of sleep phenotypes on CVD-related outcomes using sex-specific GWAS data from the MVP and All of Us. Significant sex differences in causal effects were observed, particularly between OSA and chronic kidney disease, as well as long sleep duration on several CVD-related outcomes. By applying shrinkage estimates for instrumental variable selection, we identified multiple sex-specific significant causal relationships between OSA and CVD-related phenotypes. The method is generalizable and can be used to improve power and alleviate weak instrument bias when only a small sample is available for a specific condition or group.
Title: A semi-empirical Bayes approach for calibrating weak instrumental bias in sex-specific Mendelian randomization studies. | Journal: American Journal of Human Genetics, 112(9), pp. 2213-2231 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416758/pdf/
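For readers unfamiliar with the building blocks mentioned in the abstract, the sketch below shows a textbook fixed-effect inverse-variance-weighted (IVW) Mendelian randomization estimate and a generic normal-normal shrinkage of a group-specific effect toward a pooled effect. It is not the authors' semi-empirical Bayes framework: the function names, the prior variance `tau2`, and the summary statistics are hypothetical values chosen for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    Each variant contributes a Wald ratio beta_outcome / beta_exposure,
    weighted by the inverse of its first-order variance
    (se_outcome / beta_exposure) ** 2.
    Returns (estimate, standard_error).
    """
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1 / total)


def shrink(beta_group, se_group, beta_pooled, tau2):
    """Normal-normal shrinkage of a group-specific estimate toward a pooled one.

    tau2 is an assumed prior variance for how far the true group effect may
    deviate from the pooled effect; noisier estimates are pulled more strongly.
    """
    w = tau2 / (tau2 + se_group ** 2)
    return w * beta_group + (1 - w) * beta_pooled


# Hypothetical summary statistics for three instruments.
beta_x = [0.10, 0.08, 0.12]   # variant-exposure effects
beta_y = [0.05, 0.04, 0.06]   # variant-outcome effects
se_y = [0.01, 0.01, 0.02]     # standard errors of the variant-outcome effects

causal_effect, causal_se = ivw_estimate(beta_x, beta_y, se_y)
shrunk_effect = shrink(beta_group=0.30, se_group=0.10, beta_pooled=0.10, tau2=0.01)
print(causal_effect, causal_se, shrunk_effect)
```

The shrinkage step illustrates the general idea of borrowing information across groups: a variant-exposure estimate from a small group is noisy, and a noisy estimate used as an instrument is what produces weak-instrument bias.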
Pub Date: 2025-09-04 | DOI: 10.1016/j.ajhg.2025.08.003
Tara Dutka, Erika J Faust, C Scott Gallagher, Travis Hyams, Elyse Kozlowski, Erica Landis, Minnkyong Lee, Grace F Liou, Tamara R Litwin, Christopher Lunt, Sana H Mian, Anjene Musick, Nguyen Park, Theresa Patten, Janeth Sanchez, Sheri D Schully, Cathy Shyr, Geoffrey S Ginsburg
Title: All of Us Research Program year in review: 2024. | Journal: American Journal of Human Genetics, 112(9), pp. 1983-1987 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461012/pdf/
Pub Date: 2025-09-04 | Epub Date: 2025-08-18 | DOI: 10.1016/j.ajhg.2025.07.009
Michael J Betti, James Jaworski, Shilin Zhao, J Sunil Rao, Bríd M Ryan, Ann G Schwartz, Christine M Lusk, Lucie McCoy, John K Wiencke, Marino A Bruce, Stephen Chanock, Eric R Gamazon, Jacklyn N Hellwege, Melinda C Aldrich
Striking disparities in lung cancer exist, with Black/African American individuals disproportionately affected by lung cancer, yet the genetic architecture in African ancestry individuals is poorly understood. We aimed to address this by performing a comprehensive genetic association study of lung cancer, incorporating local ancestry, across 6,490 African ancestry individuals (2,390 individuals with lung cancer and 4,100 control subjects). We identified a single genome-wide significant (p < 5 × 10⁻⁸) locus, 15q25.1 (lead SNP rs17486278, OR [95% CI] = 1.34 [1.23-1.45], p = 4.52 × 10⁻¹²), that has consistently shown a strong association with lung cancer across populations. Additionally, we identified nine suggestive (p < 1 × 10⁻⁶) loci. Four of these loci (3p12.1, 8q22.2, 14q11.2, and 18q22.3) have no prior reported associations with lung cancer. We performed a multi-ancestry lung cancer meta-analysis using prior large-scale summary statistics from European and Asian ancestry populations, incorporating our African ancestry results. The meta-analysis identified 17 genome-wide significant loci, including an association with locus 4q35.2 (p = 1.22 × 10⁻⁸), a genomic region that has been previously linked to forced expiratory volume. Genome-wide SNP-based heritability for lung cancer was 16% among African ancestry individuals. Follow-up in silico functional analyses identified genetically regulated gene expression (GReX) of nine genes (AC012184.3, ADK, CCDC12, CHRNA3, EML4, PSMA4, SNRNP200, TMEM50A, and ZYG11A) associated with lung cancer risk and biological pathways relevant to cancer and lung function. Cumulatively, these findings further elucidate the genetic architecture of lung cancer in African ancestry individuals, confirming prior loci and revealing new loci.
Title: Genetic analysis in African ancestry populations reveals genetic contributors to lung cancer susceptibility. | Journal: American Journal of Human Genetics, pp. 2102-2114 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461003/pdf/
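The odds ratio, confidence interval, and p value reported for the lead SNP can be related to one another with a short calculation. The sketch below assumes a symmetric Wald interval on the log-odds scale and a normal approximation; the study's actual test may differ, so the recovered p value is only expected to land near the reported one, and the rounding of the published interval limits its precision.

```python
import math

# Values quoted in the abstract for lead SNP rs17486278.
odds_ratio, ci_low, ci_high = 1.34, 1.23, 1.45
n_cases, n_controls, n_total = 2390, 4100, 6490

# The case and control counts add up to the stated sample size.
counts_consistent = (n_cases + n_controls == n_total)

# A 95% Wald interval spans log(OR) ± 1.96 standard errors on the log scale,
# so the standard error can be recovered from the interval width.
log_or = math.log(odds_ratio)
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

z = log_or / se_log_or
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value

print(f"SE(log OR)={se_log_or:.4f}  z={z:.2f}  p={p_value:.2e}")
print("genome-wide significant:", p_value < 5e-8)
```

Under these assumptions the recovered p value is on the order of 10⁻¹², which matches the magnitude reported in the abstract.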
Genome-wide association studies (GWASs) have statistically identified thousands of loci influencing a trait of interest. To explain the organizational principles among the functionally often unrelated encoded proteins, the omnigenic model postulates core genes with direct and peripheral genes with indirect effects on molecular trait etiology. However, both core genes and the network paths by which they are influenced are unknown for most traits. Using our previously developed Speos framework to identify core genes, we here focus on the autoimmune disease ulcerative colitis (UC) to explore the regulatory relationships between core and peripheral genes and their organization in multi-modal molecular networks. The identified core genes are characterized by tissue-specific expression and trait-relevant network connections. Using genome-scale perturbation data, we demonstrate that one-third of overexpression or knockdown perturbations impact core genes differently than peripheral genes, a pattern that is not observed for GWAS or random genes. This coordinated perturbation response by core genes was robust across traits and cell lines, despite differing causal perturbagens, suggesting a universal core-gene property. Intriguingly, co-perturbation simulations suggest frequent genetic interactions between core genes, highlighting the role of non-additive interactions previously not considered in the omnigenic model. Thus, physiologically relevant core-gene sets occupy a central position in the underlying molecular network, resulting in genome-wide coordinated regulation. As previous theoretical studies have shown that coordinated regulation of core genes could explain much of the missing heritability, our qualitative observation can provide a foundation for detailed quantitative analyses.
Title: Exploring the omnigenic architecture of selected complex traits. | Authors: Florin Ratajczak, Matthias Heinig, Pascal Falter-Braun | Pub Date: 2025-09-04 | DOI: 10.1016/j.ajhg.2025.07.006 | Journal: American Journal of Human Genetics, pp. 2115-2137 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461020/pdf/
Pub Date: 2025-09-04 | Epub Date: 2025-08-20 | DOI: 10.1016/j.ajhg.2025.07.016
Anjali Das, Chirag Lakhani, Chloé Terwagne, Jui-Shan T Lin, Tatsuhiko Naito, Towfique Raj, David A Knowles
Increased availability of whole-genome sequencing (WGS) has facilitated the study of rare variants (RVs) in complex diseases. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most do not fully leverage the availability of variant-level functional annotations. We propose genome-wide rare variant enrichment evaluation (gruyere), an empirical Bayesian framework that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization. We apply gruyere to WGS data from the Alzheimer's Disease Sequencing Project to identify Alzheimer disease (AD)-associated genes and annotations. Growing evidence suggests that the disruption of microglial regulation is a key contributor to AD risk, yet existing methods have not examined rare non-coding effects that incorporate such cell-type-specific information. To address this gap, we (1) define per-gene non-coding RV test sets using predicted enhancer and promoter regions in microglia and other brain cell types (oligodendrocytes, astrocytes, and neurons) and (2) include cell-type-specific variant effect predictions (VEPs) as functional annotations. gruyere identifies 13 significant genetic associations not detected by other RV methods, four of which remain significant in omnibus tests. We find that deep-learning-based VEPs for splicing, transcription factor binding, and chromatin state are highly predictive of functional non-coding RVs. Our study establishes a robust framework incorporating functional annotations, coding RVs, and cell-type-associated non-coding RVs to perform genome-wide association tests, uncovering AD-relevant genes and annotations.
Title: Leveraging functional annotations to map rare variants associated with Alzheimer disease with gruyere. | Journal: American Journal of Human Genetics, pp. 2138-2151 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12373115/pdf/
Pub Date: 2025-09-04 | Epub Date: 2025-08-19 | DOI: 10.1016/j.ajhg.2025.07.014
Cornelis Blauwendraat, Alastair J Noyce, Ignacio F Mata, Laurel A Screven, J Solle, Sonya B Dumanis, Ekemini A Riley, Maria Teresa Periñan, Njideka Okubadejo, Christine Klein, Huw R Morris, Andrew B Singleton
The need for more diversity in research is a widely recognized problem, especially in the genetics and genomics fields. While resolving this problem seems straightforward by recruiting and sequencing research participants from underrepresented populations, implementing an effort like this is complex operationally. Key considerations include ensuring equity, building capacity, and creating a sustainable research collective that works collaboratively to address local and global questions in research. Here, we provide a roadmap detailing how the Global Parkinson's Genetics Program (GP2) is tackling the lack of diversity in Parkinson disease (PD) genetics research and also reflect on 5 years of progress. GP2 aims to be a global hub facilitating subject recruitment, sample collection, data generation, harmonization, and sharing. It also acts as a centralized target discovery hub for PD genetics worldwide. The underlying tenets of GP2 center on transparency, the democratization of data and discovery, training and career support, providing (or generating) actionable results, and creating a functional collective of PD researchers worldwide. GP2 is working with 275 research groups worldwide. There are data and samples from 265,000 subjects currently committed to the program as of May 2025. We discuss the lessons learned in this process and highlight what we view as the emerging opportunities that the program will aim to target over the next period.
Title: Tackling a disease on a global scale, the Global Parkinson's Genetics Program, GP2: A new generation of opportunities. | Journal: American Journal of Human Genetics, pp. 1988-2000 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461011/pdf/
Pub Date: 2025-09-04 | Epub Date: 2025-08-18 | DOI: 10.1016/j.ajhg.2025.07.012
Chad A Shaw, C J Williams, Taotao Tan, Daniel Illera, Nicholas Di, Joshua M Shulman, John W Belmont
We present the Causal Pivot (CP) as a structural causal model (SCM) for analyzing genetic heterogeneity in complex diseases. The CP leverages an established causal factor or factors to detect the contribution of additional suspected causes. Specifically, polygenic risk scores (PRSs) serve as known causes, while rare variants (RVs) or RV ensembles are evaluated as candidate causes. The CP incorporates outcome-induced association by conditioning on disease status. We derive a conditional maximum-likelihood procedure for binary and quantitative traits and develop the Causal Pivot likelihood ratio test (CP-LRT) to detect causal signals. Through simulations, we demonstrate the CP-LRT's robust power and superior error control compared to alternatives. We apply the CP-LRT to UK Biobank (UKB) data, analyzing three exemplar diseases: hypercholesterolemia (HC, low-density lipoprotein cholesterol ≥4.9 mmol/L; n_c = 24,656), breast cancer (BC, ICD-10 C50; n_c = 12,479), and Parkinson disease (PD, ICD-10 G20; n_c = 2,940). For PRS, we utilize UKB-derived values, and for RVs, we analyze ClinVar pathogenic/likely pathogenic variants and loss-of-function mutations in disease-relevant genes: LDLR for HC, BRCA1 for BC, and GBA1 for PD. Significant CP-LRT signals were detected for all three diseases. Cross-disease and synonymous variant analyses serve as controls. We further develop ancestry adjustment using matching and inverse probability weighting as well as regression and doubly robust methods; we extend this to examine oligogenic burden in the lysosomal storage pathway in PD. The CP reveals an approach to address heterogeneity and is an extensible method for inference and discovery in complex disease genetics.
Title: The Causal Pivot: A structural approach to genetic heterogeneity and variant discovery in complex diseases. Journal: American Journal of Human Genetics. Pages: 2232-2246. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461002/pdf/
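The outcome-induced association that the Causal Pivot abstract relies on can be illustrated with a small simulation. The sketch below is not the authors' CP-LRT or their likelihood; it is a toy liability-threshold model in which every number (sample size, carrier frequency, effect size, threshold) and every function name is an assumption chosen for illustration. PRS and rare-variant carrier status are drawn independently, so any difference in mean PRS between carriers and non-carriers appears only after conditioning on case status.

```python
import random
import statistics


def simulate(n=200_000, carrier_freq=0.02, rv_effect=2.0, threshold=2.0, seed=1):
    """Toy liability-threshold model (all parameter values are hypothetical).

    Returns a list of (prs, is_carrier, is_case) tuples. PRS and carrier
    status are generated independently of each other.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        prs = rng.gauss(0.0, 1.0)               # known cause
        carrier = rng.random() < carrier_freq   # candidate cause
        liability = prs + rv_effect * carrier + rng.gauss(0.0, 1.0)
        rows.append((prs, carrier, liability > threshold))
    return rows


def mean_prs(rows, carrier, cases_only):
    """Mean PRS among carriers or non-carriers, optionally restricted to cases."""
    vals = [p for p, c, d in rows if c == carrier and (d or not cases_only)]
    return statistics.fmean(vals)


if __name__ == "__main__":
    rows = simulate()
    for cases_only in (False, True):
        label = "cases only" if cases_only else "everyone"
        print(label,
              "carriers:", round(mean_prs(rows, True, cases_only), 3),
              "non-carriers:", round(mean_prs(rows, False, cases_only), 3))
```

In the full simulated population the two groups have nearly the same mean PRS, while among cases the carriers have a clearly lower mean PRS: a carrier needs less polygenic load to cross the threshold. Under these assumptions, that depletion is the kind of signal a test conditioned on disease status can look for.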
Pub Date: 2025-09-04 | Epub Date: 2025-08-13 | DOI: 10.1016/j.ajhg.2025.07.008
Joshua G Schraiber, Jeffrey P Spence, Michael D Edge
As genetic sequencing costs have plummeted, datasets with sizes previously unthinkable have begun to appear. Such datasets present opportunities to learn about evolutionary history, particularly via rare alleles that record the very recent past. However, beyond the computational challenges inherent in the analysis of many large-scale datasets, large population-genetic datasets present theoretical problems. In particular, the majority of population-genetic tools require the assumption that each mutant allele in the sample is the result of a single mutation (the "infinite-sites" assumption), which is violated in large samples. Here, we present DR EVIL, a method for estimating mutation rates and recent demographic history from very large samples. DR EVIL avoids the infinite-sites assumption by using a diffusion approximation to a branching-process model with recurrent mutation. This approach results in tractable likelihoods that are accurate for rare alleles. We show that DR EVIL performs well in simulations and apply it to rare-variant data from one million haploid samples. We identify mutation-rate heterogeneity even after accounting for trinucleotide context and methylation status. We also predict that at modern sample sizes, the alleles at most polymorphic sites with high mutation rates represent the descendants of multiple mutation events.
Title: Estimation of demography and mutation rates from one million haploid genomes. Journal: American Journal of Human Genetics. Pages: 2152-2166. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461025/pdf/
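To see why the infinite-sites assumption becomes harder to defend as samples grow, a back-of-the-envelope Poisson calculation may help. The sketch below is not DR EVIL and does not use its diffusion approximation. It assumes mutations fall on the sample genealogy as a Poisson process, and it uses the expected total branch length of a constant-size coalescent as a stand-in tree length. The effective size of 10,000 and the two per-site mutation rates are hypothetical illustration values, and a constant-size model ignores recent population growth, so it likely understates the tree length relevant to real data.

```python
import math


def expected_tree_length(n_haploid, n_e=10_000):
    """Expected total branch length, in generations, of a coalescent tree for
    n_haploid samples from a constant-size diploid population of size n_e
    (a simplifying assumption): 4 * n_e * sum_{i=1}^{n-1} 1/i.
    """
    harmonic = sum(1.0 / i for i in range(1, n_haploid))
    return 4.0 * n_e * harmonic


def frac_recurrent(mu, tree_length):
    """P(at least 2 mutations | at least 1) at one site, assuming the number
    of mutations is Poisson with mean mu * tree_length (which must be > 0).
    """
    lam = mu * tree_length
    p_ge1 = -math.expm1(-lam)              # 1 - exp(-lam), computed stably
    p_ge2 = p_ge1 - lam * math.exp(-lam)   # subtract P(exactly 1)
    return p_ge2 / p_ge1


if __name__ == "__main__":
    for n in (1_000, 1_000_000):
        length = expected_tree_length(n)
        for mu in (1e-8, 1e-7):  # hypothetical low and high per-site rates
            print(n, mu, frac_recurrent(mu, length))
```

The fraction of polymorphic sites carrying more than one mutation rises with both the mutation rate and the tree length. Because tree length is an explicit argument of `frac_recurrent`, a longer tree implied by a growth model can be substituted directly for the constant-size value.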
Pub Date: 2025-09-04 | Epub Date: 2025-08-12 | DOI: 10.1016/j.ajhg.2025.08.002
Chelsea X Alvarado, Mary B Makarious, Cory A Weller, Dan Vitale, Mathew J Koretsky, Sara Bandres-Ciga, Hirotaka Iwaki, Kristin Levine, Andrew Singleton, Faraz Faghri, Mike A Nalls, Hampton L Leonard
Title: omicSynth: An open multi-omic community resource for identifying druggable targets across neurodegenerative diseases. Journal: American Journal of Human Genetics. Pages: 2248. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461009/pdf/