Pub Date : 2024-11-14 DOI: 10.1016/j.ajhg.2024.10.010
Iftikhar J Kullo, Matthew P Conomos, Sarah C Nelson, Sally N Adebamowo, Ananyo Choudhury, David Conti, Stephanie M Fullerton, Stephanie M Gogarten, Ben Heavner, Whitney E Hornsby, Eimear E Kenny, Alyna Khan, Amit V Khera, Yun Li, Iman Martin, Josep M Mercader, Maggie Ng, Laura M Raffield, Alex Reiner, Robb Rowley, Daniel Schaid, Adrienne Stilp, Ken Wiley, Riley Wilson, John S Witte, Pradeep Natarajan
By improving disease risk prediction, polygenic risk scores (PRSs) could have a significant impact on health promotion and disease prevention. Due to the historical oversampling of populations with European ancestry for genome-wide association studies, PRSs perform less well in other, understudied populations, leading to concerns that clinical use in their current forms could widen health care disparities. The PRIMED Consortium was established to develop methods to improve the performance of PRSs in global populations and individuals of diverse genetic ancestry. To this end, PRIMED is aggregating and harmonizing multiple phenotype and genotype datasets on AnVIL, an interoperable secure cloud-based platform, to perform individual- and summary-level analyses using population and statistical genetics approaches. Study sites, the coordinating center, and representatives from the NIH work alongside other NHGRI and global consortia to achieve these goals. PRIMED is also evaluating ethical and social implications of PRS implementation and investigating the joint modeling of social determinants of health and PRS in computing disease risk. The phenotypes of interest are primarily cardiometabolic diseases and cancer, the leading causes of death and disability worldwide. Early deliverables of the consortium include methods for data sharing on AnVIL, development of a common data model to harmonize phenotype and genotype data from cohort studies as well as electronic health records, adaptation of recent guidelines for population descriptors to global cohorts, and sharing of PRS methods/tools. As a multisite collaboration, PRIMED aims to foster equity in the development and use of polygenic risk assessment.
Title: The PRIMED Consortium: Reducing disparities in polygenic risk assessment.
Journal: American Journal of Human Genetics
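The PRIMED abstract above centers on polygenic risk scores. As a minimal sketch of what such a score usually is (a weighted sum of effect-allele dosages), the following Python example may help; the function name, the weights, and the dosages are all hypothetical illustrations and do not come from PRIMED or from any published score.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages.

    dosages: copies of the effect allele per variant (0, 1, or 2, or an
             imputed fractional value) for one individual.
    weights: per-allele effect sizes, typically taken from GWAS summary
             statistics (hypothetical values are used below).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: three variants with made-up per-allele weights.
example_weights = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
score = polygenic_risk_score(example_dosages, example_weights)
print(f"PRS = {score:.2f}")
```

Because the weights are usually estimated in one population, the same sum can predict less well in another, which is the performance gap the consortium aims to reduce.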
Pub Date : 2024-11-13 DOI: 10.1016/j.ajhg.2024.10.011
Mafalda Dias, Rose Orenbuch, Debora S Marks, Jonathan Frazer
There has been considerable progress in building models to predict the effect of missense substitutions in protein-coding genes, fueled in large part by progress in applying deep learning methods to sequence data. These models have the potential to enable clinical variant annotation on a large scale and hence increase the impact of patient sequencing in guiding diagnosis and treatment. To realize this potential, it is essential to provide reliable assessments of model performance, scope of applicability, and robustness. As a response to this need, the ClinGen Sequence Variant Interpretation Working Group, Pejaver et al., recently proposed a strategy for validation and calibration of in-silico predictions in the context of guidelines for variant annotation. While this work marks an important step forward, the strategy presented still has important limitations. We propose core principles and recommendations to overcome these limitations that can enable both more reliable and more impactful use of variant effect prediction models in the future.
Title: Toward trustable use of machine learning models of variant effects in the clinic.
Journal: American Journal of Human Genetics
Pub Date : 2024-11-13 DOI: 10.1016/j.ajhg.2024.10.016
Sai Swaroop Chittoor, Simona Giunta
Secondary structures are non-canonical arrangements of nucleic acids that arise from intra-strand interactions, including base pairing, stacking, or other higher-order features that deviate from the standard double-helical conformation. While these structures are extensively studied in RNA, they can also form when DNA becomes single stranded, creating topological roadblocks that can impact essential DNA-based processes such as replication, transcription, and repair, ultimately affecting genome stability. The availability of complete linear sequences of human genomes, including repetitive loci, enables DNA secondary structures to be predicted and compared across regions. Here, we evaluate the intrinsic properties of linear single-stranded DNA sequences sampled from specialized human loci such as centromeres, pericentromeres, ribosomal DNA (rDNA), and coding regions of the CHM13 genome. Our comparative analysis of predicted secondary structures across human chromosomes revealed heightened presence, complexity, and instability of secondary structures within the centromere; these gradually decreased through the pericentromere and onto the chromosome arms and were, on average, lowest in coding regions. Notably, centromeric repeats exhibited the highest level of topological complexity within both the active and divergent domains, even when compared to other repetitive tandem satellites, such as rDNA in acrocentric chromosomes. Our findings provide evidence of the intrinsic self-hybridizing properties of centromere repeats, which are capable of generating complex topological structures that may functionally correlate with chromosome missegregation, especially when centromeric chromatin is disrupted. Processes such as long non-coding RNA transcription, recombination, and other mechanisms that dechromatinize and unwind stretches of linear DNA in these regions create in vivo opportunities for the DNA acrobatics predicted here.
Title: Comparative analysis of predicted DNA secondary structures infers complex human centromere topology.
Journal: American Journal of Human Genetics
Pub Date : 2024-11-12 DOI: 10.1016/j.ajhg.2024.10.009
Marie Saitou, Andy Dahl, Qingbo Wang, Xuanyao Liu
Population-level genetic studies are overwhelmingly biased toward European ancestries. Transferring genetic predictions from European ancestries to other ancestries results in a substantial loss of accuracy. Yet, it remains unclear how much various genetic factors, such as causal effect differences, linkage disequilibrium (LD) differences, or allele frequency differences, contribute to the loss of prediction accuracy across ancestries. In this study, we used gene expression levels in lymphoblastoid cell lines to understand how much each genetic factor contributes to lowered portability of gene expression prediction from European to African ancestries. We found that cis-genetic effects on gene expression are highly similar between European and African individuals. However, we found that allele frequency differences of causal variants have a striking impact on prediction portability. For example, portability is reduced by more than 32% when the causal cis-variant is common (minor allele frequency, MAF >5%) in European samples (training population) but is rarer (MAF <5%) in African samples (prediction population). While large allele frequency differences can decrease portability through increasing LD differences, we also determined that causal allele frequency can significantly impact portability when the impact from LD is substantially controlled. This observation suggests that improving statistical fine-mapping alone does not overcome the loss of portability resulting from differences in causal allele frequency. We conclude that causal cis-eQTL effects are highly similar in European and African individuals, and allele frequency differences have a large impact on the accuracy of gene expression prediction.
Title: Allele frequency impacts the cross-ancestry portability of gene expression prediction in lymphoblastoid cell lines.
Journal: American Journal of Human Genetics
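One textbook reason allele frequency can matter for the portability described in the abstract above, even when causal effects are identical, is that the variance a variant contributes depends on its frequency. The sketch below uses the standard additive-model expression 2p(1 − p)β² under Hardy-Weinberg proportions. It is an illustration only, not the authors' analysis; the effect size and both allele frequencies are hypothetical.

```python
def additive_variance(maf: float, beta: float) -> float:
    """Variance contributed by one biallelic variant with per-allele effect
    `beta`, assuming Hardy-Weinberg proportions: 2 * p * (1 - p) * beta**2."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must be between 0 and 0.5")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


# Hypothetical values: the same causal effect in both populations,
# but the variant is common in the training sample and rare in the target.
beta = 0.5
var_training = additive_variance(0.20, beta)  # common (MAF 20%)
var_target = additive_variance(0.02, beta)    # rare (MAF 2%)
ratio = var_target / var_training

print(f"training variance: {var_training:.4f}")
print(f"target variance:   {var_target:.4f}")
print(f"ratio:             {ratio:.4f}")
```

With these made-up inputs the variant contributes far less expression variance in the target population, so a predictor built on it has less signal to capture there, regardless of how well the causal variant was fine-mapped.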
Pub Date : 2024-11-12 DOI: 10.1016/j.ajhg.2024.10.018
Sanni Ruotsalainen, Juha Karjalainen, Mitja Kurki, Elisa Lahtela, Matti Pirinen, Juha Riikonen, Jarmo Ritari, Silja Tammi, Jukka Partanen, Hannele Laivuori, Aarno Palotie, Henrike Heyne, Mark Daly, Elisabeth Widen
Female infertility is a common and complex health problem affecting millions of women worldwide. While multiple factors can contribute to this condition, the underlying cause remains elusive in up to 15%-30% of affected individuals. In our large genome-wide association study (GWAS) of 22,849 women with infertility and 198,989 control individuals from the Finnish population cohort FinnGen, we unveil a landscape of genetic factors associated with the disorder. Our recessive analysis identified a low-frequency stop-gained mutation in TATA-box binding protein-like 2 (TBPL2; c.895A>T [p.Arg299Ter]; minor-allele frequency [MAF] = 1.2%) with an impact comparable to highly penetrant monogenic mutations (odds ratio [OR] = 650, p = 4.1 × 10^-25). While previous studies have linked the orthologous gene to anovulation and sterility in knockout mice, the severe consequence of the p.Arg299Ter variant was evidenced by individuals carrying two copies of that variant having significantly fewer offspring (average of 0.16) compared to women belonging to the other genotype groups (average of 1.75 offspring, p = 1.4 × 10^-15). Notably, all homozygous women who had given birth had received infertility therapy. Moreover, our age-stratified analyses identified three additional genome-wide significant loci. Two loci were associated with early-onset infertility (diagnosed before age 30), located near CHEK2 and within the major histocompatibility complex (MHC) region. The third locus, associated with late-onset infertility, had its lead SNP located in an intron of a long non-coding RNA (lncRNA) gene. Taken together, our data highlight the significance of rare recessive alleles in shaping female infertility risk. The results further provide evidence supporting specific age-dependent mechanisms underlying this complex disorder.
Title: Inherited infertility: Mapping loci associated with impaired female reproduction.
Journal: American Journal of Human Genetics
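To give a sense of scale for the recessive TBPL2 finding in the abstract above, the sketch below works out how many homozygotes one would expect for an allele at MAF 1.2% in a sample of this size. It assumes Hardy-Weinberg equilibrium and that the reported MAF applies to the whole analyzed sample; both are assumptions made here for illustration. The abstract does not report the observed homozygote count, so nothing below should be read as a result of the study.

```python
def hardy_weinberg(q: float) -> dict:
    """Expected genotype frequencies for a biallelic variant with minor
    allele frequency `q`, assuming Hardy-Weinberg equilibrium."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    p = 1.0 - q
    return {"hom_ref": p * p, "het": 2.0 * p * q, "hom_alt": q * q}


# Figures taken from the abstract.
maf = 0.012
n_cases = 22_849
n_controls = 198_989

freqs = hardy_weinberg(maf)
n_total = n_cases + n_controls
expected_homozygotes = freqs["hom_alt"] * n_total

print(f"expected homozygote frequency: {freqs['hom_alt']:.6f}")
print(f"expected homozygotes among {n_total:,} women: {expected_homozygotes:.1f}")
```

Under these assumptions homozygotes are expected at roughly 1.4 per 10,000 individuals, which illustrates why a recessive-model analysis in a very large cohort is needed to detect such an effect.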
Pub Date : 2024-11-12 DOI: 10.1016/j.ajhg.2024.10.015
Michelle A Ramos, Katherine E Bonini, Laura Scarimbolo, Nicole R Kelly, Beverly Insel, Sabrina A Suckiel, Kaitlyn Brown, Miranda Di Biase, Katie M Gallagher, Jessenia Lopez, Karla López Aguiñiga, Priya N Marathe, Estefany Maria, Jacqueline A Odgis, Jessica E Rodriguez, Michelle A Rodriguez, Nairovylex Ruiz, Monisha Sebastin, Nicole M Yelton, Charlotte Cunningham-Rundles, Melvin Gertner, Irma Laguerre, Thomas V McDonald, Patricia E McGoldrick, Mimsie Robinson, Arye Rubinstein, Lisa H Shulman, Trinisha Williams, Steven M Wolf, Elissa G Yozawitz, Randi E Zinberg, Noura S Abul-Husn, Laurie J Bauman, George A Diaz, Bart S Ferket, John M Greally, Vaidehi Jobanputra, Bruce D Gelb, Eimear E Kenny, Melissa P Wasserstein, Carol R Horowitz
Underrepresentation in clinical genomics research limits the generalizability of findings and the benefits of scientific discoveries. We describe the impact of patient-centered, data-driven recruitment and retention strategies in a pediatric genome sequencing study. We collaborated with a stakeholder board, conducted formative research with adults whose children had undergone genomic testing, and piloted and revised study approaches and materials. Our approaches included racially, ethnically, and linguistically congruent study staff, relational interactions, study visit flexibility, and data-informed quality improvement. Of 1,656 eligible children, only 6.5% declined. Their parents/legal guardians were 76.9% non-White, 65.6% had public health insurance for the child, 49.9% lived below the federal poverty level, and 52.8% resided in a medically underserved area. Among those enrolled, 87.3% completed all study procedures. There were no sociodemographic differences between those who enrolled and declined or between those retained and lost to follow-up. We outline stakeholder-engaged approaches that may have led to the successful enrollment and retention of diverse families. These approaches may inform future research initiatives aiming to engage and retain underrepresented populations in genomics medicine research.
Title: Employing effective recruitment and retention strategies to engage a diverse pediatric population in genomics research.
Journal: American Journal of Human Genetics
Polycystic ovarian syndrome (PCOS) is an endocrine syndrome that affects a large portion of women worldwide. This proteogenomic and functional study aimed to uncover candidate therapeutic targets for PCOS. We comprehensively investigated the causal association between circulating proteins and PCOS using two-sample Mendelian randomization analysis. Cis-protein quantitative trait loci were derived from six genome-wide association studies (GWASs) of the plasma proteome. Genetic associations with PCOS were obtained from a large-scale GWAS meta-analysis, the FinnGen cohort, and UK Biobank. Colocalization analyses were performed to prioritize the causal role of candidate proteins. Protein-protein interaction (PPI) and druggability analyses were used to assess the druggability of candidate proteins. We evaluated the enrichment of tier 1 and 2 candidate proteins in individuals with PCOS and in a mouse model and explored the potential application of the identified drug target. Genetically predicted levels of 65 proteins exhibited associations with PCOS risk, with elevated levels of 30 proteins and decreased levels of 35 proteins linked to higher susceptibility. PPI analyses revealed that FSHB, POSTN, CCN2, and CXCL11 interacted with targets of current PCOS medications. Eighty medications targeting 20 proteins showed potential for repurposing as PCOS therapies. EGLN1 levels were elevated in granulosa cells and the plasma of individuals with PCOS and in the plasma and ovaries of a dehydroepiandrosterone (DHEA)-induced PCOS mouse model. Administration of roxadustat, an EGLN1 inhibitor, in the PCOS mouse model elucidated the role of the EGLN1-HIF1α-ferroptosis axis in inducing PCOS and validated the drug's therapeutic effect in PCOS. Our study identifies candidate proteins causally associated with PCOS risk and suggests that targeting EGLN1 provides a promising treatment strategy.
Title: Proteome-wide Mendelian randomization and functional studies uncover therapeutic targets for polycystic ovarian syndrome. Journal: American Journal of Human Genetics. Publication date: 2024-11-11. DOI: 10.1016/j.ajhg.2024.10.008
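The abstract above relies on two-sample Mendelian randomization. As a rough illustration only, the following Python sketch shows one common estimator, the fixed-effect inverse-variance weighted (IVW) combination of per-variant Wald ratios. The abstract does not say which estimator the authors used, so the choice of IVW is an assumption, and the function names and the toy summary statistics are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate (outcome effect / exposure effect).

    The standard error uses the common first-order approximation that
    ignores uncertainty in the exposure effect.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted mean of Wald ratios.

    `variants` is an iterable of (beta_exposure, beta_outcome, se_outcome).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for beta_x, beta_y, se_y in variants:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        total_weight += weight
    return weighted_sum / total_weight, math.sqrt(1.0 / total_weight)


# Hypothetical summary statistics for three cis-pQTL instruments:
# (effect on protein level, effect on outcome log-odds, SE of outcome effect)
toy_variants = [(0.5, 0.10, 0.02), (0.4, 0.08, 0.02), (0.25, 0.05, 0.01)]
ivw_beta, ivw_se = ivw_estimate(toy_variants)
print(f"IVW estimate: {ivw_beta:.3f} (SE {ivw_se:.3f})")
```

In practice the instruments would be the cis-pQTLs from the proteome GWASs and the outcome effects would come from the PCOS GWAS; dedicated packages also handle allele harmonization, pleiotropy checks, and colocalization, none of which this sketch attempts.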
Pub Date : 2024-11-09 DOI: 10.1016/j.ajhg.2024.10.014
Jérôme Delplanque, Lauriane Le Collen, Hélène Loiselle, Audrey Leloire, Bénédicte Toussaint, Emmanuel Vaillant, Guillaume Charpentier, Sylvia Franc, Beverley Balkau, Michel Marre, Emma Henriques, Emmanuel Buse Falay, Mehdi Derhourhi, Philippe Froguel, Amélie Bonnefond
Individuals with obesity caused by biallelic pathogenic LEPR (leptin receptor) variants can benefit from setmelanotide, the novel MC4R agonist. An ongoing phase 3 clinical trial (NCT05093634) includes individuals with obesity who carry a heterozygous LEPR variant, although the obesogenic impact of these variants remains incompletely evaluated. The aim of this study was to functionally assess heterozygous variants in LEPR and to evaluate their effect on obesity. We sequenced LEPR in ∼10,000 participants from the French RaDiO study. We found 86 rare heterozygous variants. Each identified variant was then investigated in vitro using luciferase and western blot assays. Using the criteria of the American College of Medical Genetics and Genomics (ACMG), including the strong criterion related to functional assays, we found 12 pathogenic LEPR variants. Most heterozygotes did not present with obesity, and we found no association between these pathogenic variants and body mass index (BMI). This lack of association between pathogenic LEPR variants and obesity risk or BMI was confirmed using exome data from 200,000 individuals in the UK Biobank. In the literature, among 55 reported heterozygotes for a rare pathogenic LEPR variant, only 27% had obesity. In conclusion, monoallelic pathogenic LEPR variants were functionally tested, and they do not increase obesity risk or BMI. This raises questions about the use of setmelanotide, a costly drug with potential side effects, based solely on the presence of a heterozygous LEPR variant.
Title: Monoallelic pathogenic variants in LEPR do not cause obesity. Journal: American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2024.10.014
Pub Date : 2024-11-07 Epub Date: 2024-10-08 DOI: 10.1016/j.ajhg.2024.09.004
Neke Ibeh, Pradiptajati Kusuma, Chelzie Crenna Darusallam, Safarina G Malik, Herawati Sudoyo, Davis J McCarthy, Irene Gallego Romero
One of the regulatory mechanisms influencing the functional capacity of genes is alternative splicing (AS). Previous studies exploring the splicing landscape of human tissues have shown that AS has contributed to human biology, especially in disease progression and the immune response. Nonetheless, this phenomenon remains poorly characterized across human populations, and it is unclear how genetic and environmental variation contribute to AS. Here, we examine a set of 115 Indonesian samples from three traditional island populations spanning the genetic ancestry cline that characterizes Island Southeast Asia. We conduct a global AS analysis between islands to ascertain the degree of functionally significant AS events and their consequences. Using an event-based statistical model, we detected over 1,500 significant differential AS events across all comparisons. Additionally, we identify over 6,000 genetic variants associated with changes in splicing (splicing quantitative trait loci [sQTLs]), some of which are driven by Papuan-like genetic ancestry, and only show partial overlap with other publicly available sQTL datasets derived from other populations. Computational predictions of RNA binding activity reveal that a fraction of these sQTLs directly modulate the binding propensity of proteins involved in the splicing regulation of immune genes. Overall, these results contribute toward elucidating the role of genetic variation in shaping gene regulation in one of the most diverse regions in the world.
Title: Profiling genetically driven alternative splicing across the Indonesian archipelago. Journal: American Journal of Human Genetics, pages 2458-2477. DOI: 10.1016/j.ajhg.2024.09.004. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568790/pdf/
Pub Date : 2024-11-07 Epub Date: 2024-10-16 DOI: 10.1016/j.ajhg.2024.09.007
Genevieve H L Roberts, Pamela R Fain, Stephanie A Santorico, Richard A Spritz
Vitiligo is a common autoimmune disease characterized by patches of depigmented skin and overlying hair due to destruction of melanocytes in the involved regions. We investigated the relationship between vitiligo risk and vitiligo age of onset (AOO) using a vitiligo polygenic risk score that incorporated the most significant SNPs from genome-wide association studies. We find that vitiligo genetic risk and AOO are strongly inversely correlated; subjects with higher common-variant polygenic risk tend to develop vitiligo at an earlier age. Nevertheless, the correlation is not simple. In individuals who carry a single high-risk major histocompatibility complex class II haplotype, the effect of additional polygenic risk on vitiligo AOO is reduced. Particularly among those with early-AOO vitiligo (onset ≤12 years of age), genetic risk can reflect contributions from high common-variant burden but also rare variants of high effect and sometimes both. While the heritability of vitiligo is relatively high, and we here show that genetic risk factors predict vitiligo AOO, vitiligo is never congenital, and thus environmental triggers also play an important role in disease onset.
Title: Inverse relationship between polygenic risk burden and age of onset of autoimmune vitiligo. Journal: American Journal of Human Genetics, pages 2561-2565. DOI: 10.1016/j.ajhg.2024.09.007. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568747/pdf/
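The vitiligo abstract above relates a polygenic risk score to age of onset. As a minimal sketch of the general idea, assuming the score is an additive weighted sum of risk-allele dosages (the abstract does not give the authors' exact construction), the Python below computes a score per individual and its Pearson correlation with age of onset. The weights, genotypes, and ages are invented toy values, and the function names are hypothetical.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-SNP weights (for example, GWAS log odds ratios).
weights = [0.2, 0.5, 0.1]
# Toy risk-allele dosages for four individuals, with toy ages of onset.
genotypes = [[0, 0, 1], [1, 0, 1], [1, 1, 0], [2, 2, 1]]
onset_ages = [40, 30, 20, 8]

scores = [polygenic_risk_score(g, weights) for g in genotypes]
r = pearson_r(scores, onset_ages)
print("scores:", scores)
print("correlation with age of onset is negative:", r < 0)
```

The toy data are constructed so that higher scores go with earlier onset, matching the direction of the relationship the abstract reports; they say nothing about its real magnitude.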