Pub Date: 2025-02-04; DOI: 10.1016/j.xhgg.2025.100413
Pooja Middha, Linda Kachuri, Jovia L Nierenberg, Rebecca E Graff, Taylor B Cavazos, Thomas J Hoffmann, Jie Zhang, Stacey Alexeeff, Laurel Habel, Douglas A Corley, Stephen Van Den Eeden, Lawrence H Kushi, Elad Ziv, Lori C Sakoda, John S Witte
With advances in cancer screening and treatment, there is a growing population of cancer survivors who may develop subsequent primary cancers. Because hereditary cancer syndromes account for only a portion of multiple cancer cases, we sought to explore the role of common genetic variation in susceptibility to multiple primary tumors. We conducted a cross-ancestry genome-wide association study (GWAS) and transcriptome-wide association study (TWAS) of 10,983 individuals with multiple primary cancers, 84,475 individuals with single cancer, and 420,944 cancer-free controls from two large-scale studies. Our GWAS identified six lead variants across five genomic regions (3q26, 8q24, 10q24, 11q13.3, and 17p13) that were significantly associated (P < 5 × 10⁻⁸) with the risk of developing multiple primary tumors (overall and invasive) relative to cancer-free controls. We also found one variant, at 22q13.1, significantly associated with multiple cancers relative to single cancer cases. In two of these regions, multi-tissue TWAS detected associations between the development of multiple cancers and genes involved in telomere maintenance (ACTRT3 in 3q26, and SLK and STN1 in 10q24). The TWAS also identified several novel genes associated with multiple cancers, including two immune-related genes, IRF4 and TNFRSF6B. Telomere maintenance and immune dysregulation emerge as central, common pathways influencing susceptibility to multiple cancers. These findings underscore the importance of exploring shared mechanisms in carcinogenesis, offering insights for targeted prevention and intervention strategies.
Title: Unraveling the genetic landscape of susceptibility to multiple primary cancers. Journal: HGG Advances, article 100413.
Pub Date: 2025-01-30; DOI: 10.1016/j.xhgg.2025.100412
Peiyao Wang, Zhaotong Lin, Wei Pan
Mendelian randomization (MR) facilitates causal inference with observational data using publicly available genome-wide association study (GWAS) results. In a GWAS, one or more heritable covariates may be adjusted for to estimate the direct effects of SNPs on a focal trait or to improve statistical power; however, this adjustment may introduce collider bias in SNP-trait association estimates, thus affecting downstream MR analyses. Numerical studies have suggested that using covariate-adjusted GWAS summary data might introduce bias in univariable Mendelian randomization (UVMR), which can be mitigated by multivariable Mendelian randomization (MVMR). However, it remains unclear why and how MVMR works; a rigorous theory is needed to explain and substantiate this empirical observation. In this paper, we derive analytical results for the case in which multiple covariates are adjusted for in the GWAS of exposure and/or the GWAS of outcome, thus supporting and explaining the empirical results. Our analytical results offer insights into how bias arises in UVMR and how it is avoided in MVMR, regardless of whether collider bias is present. We also consider applying UVMR or MVMR methods after collider-bias correction. We conducted extensive simulations to demonstrate that, with covariate-adjusted GWAS summary data, MVMR had an advantage over UVMR by producing nearly unbiased causal estimates; however, in some situations it is advantageous to apply UVMR after bias correction. In real data analyses of GWAS data in which body mass index (BMI) was adjusted for metabolomic principal components, we examined the causal effect of BMI on blood pressure, confirming the above points.
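The collider bias mentioned in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' derivation; it assumes a hypothetical setup in which a variant `g` affects a heritable covariate `c`, an unmeasured factor `u` affects both `c` and the trait `y`, and `g` has no effect on `y` at all. All effect sizes, the allele frequency, and the sample size are illustrative assumptions.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def slope_unadjusted(g, y):
    """OLS slope of y on g (with intercept)."""
    return cov(g, y) / cov(g, g)

def slope_adjusted(g, c, y):
    """OLS coefficient of g when y is regressed on g and c (with intercept)."""
    s_gg, s_cc, s_gc = cov(g, g), cov(c, c), cov(g, c)
    s_gy, s_cy = cov(g, y), cov(c, y)
    return (s_gy * s_cc - s_cy * s_gc) / (s_gg * s_cc - s_gc ** 2)

rng = random.Random(0)   # fixed seed so the run is reproducible
n = 20000                # hypothetical sample size

# Genotype coded 0/1/2 with an assumed allele frequency of 0.3.
g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]                         # unmeasured factor
c = [0.5 * gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]   # heritable covariate
y = [ui + rng.gauss(0, 1) for ui in u]                          # trait: no effect of g

beta_unadjusted = slope_unadjusted(g, y)
beta_adjusted = slope_adjusted(g, c, y)
print(f"unadjusted: {beta_unadjusted:.3f}  covariate-adjusted: {beta_adjusted:.3f}")
```

Under these assumptions the unadjusted estimate stays near zero, while conditioning on `c` opens the path through `u` and pushes the adjusted estimate toward roughly -0.25, even though `g` has no effect on `y`.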
Title: Unbiased causal inference with Mendelian randomization and covariate-adjusted GWAS data. Journal: HGG Advances, article 100412.
Pub Date: 2025-01-29; DOI: 10.1016/j.xhgg.2025.100411
Carolina G Downie, Poojan Shrestha, Samson Okello, Mohammad Yaser, Harold H Lee, Yujie Wang, Mohanraj Krishnan, Hung-Hsin Chen, Anne E Justice, Geetha Chittoor, Navya Shilpa Josyula, Sheila Gahagan, Estela Blanco, Raquel Burrows, Paulina Correa-Burrows, Cecilia Albala, José L Santos, Bárbara Angel, Betsy Lozoff, Fernando Pires Hartwig, Bernardo Horta, Karisa Roxo Brina, Carmen R Isasi, Qibin Qi, Linda C Gallo, Krista M Perreira, Bharat Thyagarajan, Martha Daviglus, Linda Van Horn, Franklyn Gonzalez, Jonathan P Bradfield, Hakon Hakonarson, Struan Grant, Jennifer E Below, Janine Felix, Mariaelisa Graff, Kimon Divaris, Kari E North
Over the past 30 years, obesity prevalence has markedly increased globally, including among children. Although genome-wide association studies (GWAS) have identified over 1,000 genetic loci associated with obesity-related traits in adults, the genetic architecture of childhood obesity is less well-characterized. Moreover, most childhood obesity GWAS have been restricted to severely obese children, in relatively small sample sizes, and in primarily European ancestry populations. To identify genetic loci associated with early childhood BMI, we performed GWAS of BMI z-scores in eight ancestrally diverse cohorts: ZOE 2.0 cohort, the Santiago Longitudinal Study (SLS), the Vanderbilt University BioVU biobank, the Geisinger MyCode Health Initiative biobank, SOL Youth, Pelotas (Brazil) Birth Cohort, Cameron County Hispanic Cohort (CCHC), and Viva La Familia cohort. We subsequently performed inverse variance weighted fixed-effect meta-analysis of these results with previously published GWAS summary statistics of BMI z-scores of children in the Early Growth Genetics (EGG) Consortium and the Norwegian Mother and Child Cohort (MoBa), constituting a final total of 84,804 individuals. We identified 39 genome-wide significant loci associated with childhood BMI, including three putatively novel loci (EFNA5 and DTWD2, RP11-2N5.1 on chromosome 5, and LSM14A on chromosome 19). We also observed a dynamic nature of genetic loci-BMI associations across the life course, with distinct effects across childhood and adulthood, highlighting possible critical periods for early childhood interventions. These findings strengthen calls for larger population-based studies of children across age strata and across diverse populations.
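The inverse variance weighted fixed-effect meta-analysis named in this abstract can be sketched for a single variant as follows. This is a minimal illustration of the standard estimator, not the authors' pipeline; the function name `ivw_fixed_effect` and the per-cohort numbers are hypothetical.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Combine per-cohort effect estimates using inverse-variance weights."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]            # w_i = 1 / SE_i^2
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)                      # SE of the pooled estimate
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))             # two-sided normal p-value
    return beta, se, z, p

# Hypothetical estimates for one variant in two cohorts (not from the paper).
betas = [0.10, 0.20]
ses = [0.10, 0.10]
beta, se, z, p = ivw_fixed_effect(betas, ses)
print(f"beta={beta:.3f} se={se:.4f} z={z:.3f} p={p:.4f}")
```

Cohorts with smaller standard errors receive more weight, so the pooled estimate is pulled toward the most precise cohorts; with equal standard errors, as here, it is the simple mean of the two estimates.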
Title: Trans-ancestry genome wide association study of childhood body mass index identifies novel loci and age-specific effects. Journal: HGG Advances, article 100411.
Pub Date: 2025-01-27; DOI: 10.1016/j.xhgg.2025.100410
Trisha Dalapati, Liuyang Wang, Angela G Jones, Jonathan Cardwell, Iain R Konigsberg, Yohan Bossé, Don D Sin, Wim Timens, Ke Hao, Ivana Yang, Dennis C Ko
Most genetic variants identified through genome-wide association studies (GWAS) are suspected to be regulatory in nature, but only a small fraction colocalize with expression quantitative trait loci (eQTLs, variants associated with expression of a gene). Therefore, it is hypothesized but largely untested that integration of disease GWAS with context-specific eQTLs will reveal the underlying genes driving disease associations. We used colocalization and transcriptomic analyses to identify shared genetic variants and likely causal genes associated with critically ill COVID-19 and idiopathic pulmonary fibrosis (IPF). We first identified five genome-wide significant variants associated with both diseases. Four of the variants did not demonstrate clear colocalization between GWAS and healthy lung eQTL signals. Instead, two of the four variants colocalized only in cell-type and disease-specific eQTL datasets. These analyses pointed to higher ATP11A expression from the C allele of rs12585036, in monocytes and in lung tissue primarily from smokers, which increased risk of IPF and decreased risk of critically ill COVID-19. We also found lower DPP9 expression (and higher methylation at a specific CpG) from the G allele of rs12610495, acting in fibroblasts and in IPF lungs, which increased risk of both IPF and critically ill COVID-19. We further found differential expression of the identified causal genes in diseased lungs when compared to non-diseased lungs, specifically in epithelial and immune cell types. These findings highlight the power of integrating GWAS, context-specific eQTLs, and transcriptomics of diseased tissue to harness human genetic variation to identify causal genes and where they function during multiple diseases.
Title: Context-specific eQTLs provide deeper insight into causal genes underlying shared genetic architecture of critically ill COVID-19 and idiopathic pulmonary fibrosis. Journal: HGG Advances, article 100410.
Pub Date: 2025-01-23; DOI: 10.1016/j.xhgg.2025.100409
Dandan Tan, Yiheng Chen, Yann Ilboudo, Kevin Y H Liang, Guillaume Butler-Laporte, J Brent Richards
Identifying novel, high-yield drug targets is challenging and often results in a high failure rate. However, recent data indicate that leveraging human genetic evidence to identify and validate these targets significantly increases the likelihood of success in drug development. Two recent papers from Open Targets claimed that around half of US Food and Drug Administration-approved drugs had targets with direct human genetic evidence. By expanding target identification to include protein network partners (molecules in physical contact), the proportion of drug targets supported by genetic evidence increased to two-thirds. However, the efficacy of using these network partners for target identification was not formally tested. To address this, we tested the approach on a list of robust positive control genes. We used the IntAct database to find proteins that physically interact with the products of genes identified by three methods: exome-wide association studies (ExWASs); genome-wide association studies (GWASs) combined with a locus-to-gene mapping algorithm called the Effector Index; and the Genetic Priority Score (GPS), which integrated eight genetic features with drug indications from the Open Targets and SIDER databases. We assessed how accurately positive controls were identified when interacting genes were included alongside the ExWAS-, Effector Index-, and GPS-selected genes, focusing on precision, sensitivity, and specificity. Our results indicated that although molecular interactions led to higher sensitivity in identifying positive control genes, their practical application is limited by low precision. Expanding genetically identified targets to include network partners using IntAct did not increase the likelihood of identifying drug targets across the 412 tested traits, suggesting that such results should be interpreted with caution.
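The trade-off described in this abstract, higher sensitivity at the cost of precision, follows directly from how the three metrics are defined. The sketch below uses an entirely hypothetical toy gene universe (`G1` to `G10`) and made-up selections; it illustrates the definitions only and does not reproduce the study's data or results.

```python
def classification_metrics(selected, positives, universe):
    """Precision, sensitivity, and specificity of a gene selection."""
    selected, positives, universe = set(selected), set(positives), set(universe)
    tp = len(selected & positives)              # selected and truly positive
    fp = len(selected - positives)              # selected but not positive
    fn = len(positives - selected)              # positive but missed
    tn = len(universe - selected - positives)   # correctly left out

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }

# Hypothetical toy data: ten genes, two of which are positive controls.
universe = {f"G{i}" for i in range(1, 11)}
positive_controls = {"G1", "G2"}
genetics_only = {"G1"}                                   # genetically selected genes
with_partners = {"G1", "G2", "G3", "G4", "G5", "G6"}     # plus network partners

base = classification_metrics(genetics_only, positive_controls, universe)
expanded = classification_metrics(with_partners, positive_controls, universe)
print(base)
print(expanded)
```

In this toy case, adding partners recovers the missed positive control (sensitivity rises from 0.5 to 1.0) but also adds four false positives, so precision falls from 1.0 to one-third and specificity from 1.0 to 0.5.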
Title: Caution when using network partners for target identification in drug discovery. Journal: HGG Advances, article 100409.
Pub Date: 2025-01-20; DOI: 10.1016/j.xhgg.2025.100408
Melanie P Napier, Erin Ryan, Adi Reich, Joshua A Suhl, Diane Masser-Frye, Marilyn Jones, Celese Beaudreau, Nathaniel Robin, Dana Goodloe, Leandra Folk, Michelle M Morrow, Deanna Alexis Carere
The ARHGEF40 gene, also known as SOLO, encodes a RhoA-targeting guanine nucleotide exchange factor (GEF) and is currently considered a candidate gene with a potential relationship to disease. Our laboratory has confirmed variants at position p.Arg225 of the ARHGEF40 protein in multiple unrelated individuals with a phenotype including dysmorphic features, congenital anomalies and neurodevelopmental abnormalities. Here, we provide genetic and phenotypic information for two individuals harboring de novo variants at p.Arg225 and sharing a highly similar phenotype. This report suggests a relationship between variants at this amino acid position and autosomal dominant disease, and further studies will be needed to characterize this disease-gene relationship and elucidate the disease mechanism.
Title: Missense variants at the p.Arg225 residue in ARHGEF40 identified in individuals with multiple congenital anomalies and developmental delay. Journal: HGG Advances, article 100408.
Pub Date: 2025-01-15; DOI: 10.1016/j.xhgg.2025.100406
Zachary R McCaw, Rounak Dey, Hari Somineni, David Amar, Sumit Mukherjee, Kaitlin Sandor, Theofanis Karaletsos, Daphne Koller, Hugues Aschard, George Davey Smith, Daniel MacArthur, Colm O'Dushlaine, Thomas W Soare
Genome-wide association studies (GWASs) are often performed on ratios composed of a numerator trait divided by a denominator trait. Examples include body mass index (BMI) and the waist-to-hip ratio, among many others. Explicitly or implicitly, the goal of forming the ratio is typically to adjust for an association between the numerator and denominator. While forming ratios may be clinically expedient, there are several important issues with performing GWAS on ratios. Forming a ratio does not "adjust" for the denominator in the sense of conditioning on it, and it is unclear whether associations with ratios are attributable to the numerator, the denominator, or both. Here we demonstrate that associations arising in ratio GWAS can be entirely denominator driven, implying that at least some associations uncovered by ratio GWAS may be due solely to a putative adjustment variable. In a survey of 10 common ratio traits, we find that the ratio model disagrees with the adjusted model (performing GWAS on the numerator while conditioning on the denominator) at around 1/3 of loci. Using BMI as an example, we show that variants detected by only the ratio model are more strongly associated with the denominator (height), while variants detected by only the adjusted model are more strongly associated with the numerator (weight). Although the adjusted model provides effect sizes with a clearer interpretation, it is susceptible to collider bias. We propose and validate a simple method of correcting for the genetic component of collider bias via leave-one-chromosome-out polygenic scoring.
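The claim that a ratio association can be entirely denominator driven can be checked with a toy simulation. The sketch below is an assumption-laden illustration, not the authors' analysis: a hypothetical variant `g` shifts only the denominator `den`, while the numerator `num` is generated independently of both. It then compares the ratio model (regress `num / den` on `g`) with the adjusted model (regress `num` on `g` and `den`).

```python
import random

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def slope_simple(x, y):
    """OLS slope of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)

def slope_given_covariate(x, c, y):
    """OLS coefficient of x when y is regressed on x and c (with intercept)."""
    s_xx, s_cc, s_xc = cov(x, x), cov(c, c), cov(x, c)
    s_xy, s_cy = cov(x, y), cov(c, y)
    return (s_xy * s_cc - s_cy * s_xc) / (s_xx * s_cc - s_xc ** 2)

rng = random.Random(1)   # fixed seed so the run is reproducible
n = 20000                # hypothetical sample size

# Genotype coded 0/1/2 with an assumed allele frequency of 0.3.
g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
den = [10.0 + 0.5 * gi + rng.gauss(0, 0.5) for gi in g]   # variant acts on denominator
num = [50.0 + rng.gauss(0, 5.0) for _ in range(n)]        # numerator ignores variant
ratio = [a / b for a, b in zip(num, den)]

beta_ratio = slope_simple(g, ratio)                    # ratio model
beta_adjusted = slope_given_covariate(g, den, num)     # adjusted model
print(f"ratio model: {beta_ratio:.3f}  adjusted model: {beta_adjusted:.3f}")
```

Under these assumptions the ratio model reports a clearly negative effect of `g`, although `g` never touches the numerator, while the adjusted model's coefficient stays near zero. As the abstract notes, the adjusted model has its own weakness (collider bias) when numerator and denominator share causes, which this toy setup deliberately excludes.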
Title: Pitfalls in performing genome-wide association studies on ratio traits. Journal: HGG Advances, article 100406.
Pub Date: 2025-01-11; DOI: 10.1016/j.xhgg.2025.100405
Blaine A Bates, Kylee E Bates, Spencer A Boris, Colin Wessman, David Stone, Justin Bryan, Mary F Davis, Matthew H Bailey
Using rare cancer predisposition alleles derived from The Cancer Genome Atlas (TCGA) and high cancer prevalence (14% of participants) in All of Us (version 6), we assessed the impact of these rare alleles on cancer occurrence in six broad groups of genetic similarity provided by All of Us: African/African American (AFR), Admixed American/Latino (AMR), East Asian (EAS), European (EUR), Middle Eastern (MID), or South Asian (SAS). We observed that germline susceptibility to cancer consistently replicates in EUR-like participants but less so in other participants. We found that All of Us participants from the EUR (p = 1.8 × 10⁻⁷), AFR (p = 0.018), and MID (p = 0.0083) genetic similarity groups who carry a rare pathogenic mutation are more likely to have cancer than those without a rare pathogenic mutation. With the advent of combining medical records and genetic mutations, we also performed a phenome-wide association study (PheWAS) to assess the effect of pathogenic variants on additional phenotypes. This analysis again showed several associations between predisposition variants and cancer in EUR-like participants, but fewer in those of the other genetic similarity groups. As All of Us grows to 1 million participants, our projections suggest sufficient power (>99%) to detect cancer-associated variants that are common, but limited power (∼28%) to detect rare mutations when using the entire cohort. This study provides preliminary insights into genetic predispositions to cancer across a diverse cohort and demonstrates the value of All of Us as a resource for cancer research.
{"title":"Intersection of rare pathogenic variants from TCGA in the All of Us Research Program v6.","authors":"Blaine A Bates, Kylee E Bates, Spencer A Boris, Colin Wessman, David Stone, Justin Bryan, Mary F Davis, Matthew H Bailey","doi":"10.1016/j.xhgg.2025.100405","DOIUrl":"10.1016/j.xhgg.2025.100405","url":null,"abstract":"<p><p>Using rare cancer predisposition alleles derived from The Cancer Genome Atlas (TCGA) and high cancer prevalence (14% of participants) in All of Us (version 6), we assessed the impact of these rare alleles on cancer occurrence in six broad groups of genetic similarity provided by All of Us: African/African American (AFR), Admixed American/Latino (AMR), East Asian (EAS), European (EUR), Middle Eastern (MID), or South Asian (SAS). We observed that germline susceptibility to cancer consistently replicates in EUR-like participants but less so in other participants. We found that All of Us participants from the EUR (p = 1.8 × 10<sup>-7</sup>), AFR (p = 0.018), and MID (p = 0.0083) genetic similarity groups who carry a rare pathogenic mutation are more likely to have cancer than those without a rare pathogenic mutation. With the advent of combining medical records and genetic mutations, we also performed a phenome-wide association study (PheWAS) to assess the effect of pathogenic variants on additional phenotypes. This analysis again showed several associations between predisposition variants and cancer in EUR-like participants, but fewer in those of the other genetic similarity groups. As All of Us grows to 1 million participants, our projections suggest sufficient power (>99%) to detect cancer-associated variants that are common, but limited power (∼28%) to detect rare mutations when using the entire cohort. 
This study provides preliminary insights into genetic predispositions to cancer across a diverse cohort and demonstrates the value of All of Us as a resource for cancer research.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100405"},"PeriodicalIF":3.3,"publicationDate":"2025-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142971088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SOX9 encodes an SRY-related transcription factor critical for chondrogenesis and sex determination among other processes. Loss-of-function variants cause campomelic dysplasia and Pierre Robin sequence, while both gain- and loss-of-function variants cause disorders of sex development. SOX9 has also been linked to scoliosis and cancers, but variants are undetermined. It is highly expressed in tooth progenitor cells, but its odontogenic roles remain elusive, and tooth defects are unreported in SOX9-related conditions. Here, we performed whole-exome sequencing for nine unrelated children with tooth eruption delay and no known syndromes and identified a 7-year-old girl heterozygous for a SOX9 p.Thr239Pro variant and a 10-year-old boy heterozygous for presumably adjacent p.Thr239Pro and p.Thr240Pro variants. These variants were de novo and rare in control populations. Both cases had primary tooth eruption delay. Additionally, the boy had mesiodens blocking permanent central upper incisor eruption, severe scoliosis, and mild craniofacial and appendicular skeleton abnormalities. p.Thr239 and p.Thr240 occupy variable and obligatory positions, respectively, in a cell division control protein 4 (Cdc4)/FBXW7-targeted phosphodegron motif (CPD) fully conserved in SOX9 vertebrate orthologs and SOX8 and SOX10 paralogs, but functionally uncharacterized in vivo. Structural modeling predicted p.Thr240Pro and p.Thr239Pro/p.Thr240Pro but not p.Thr239Pro to strongly reduce SOX9/FBXW7 interaction. Accordingly, p.Thr240Pro and p.Thr239Pro/p.Thr240Pro but not p.Thr239Pro blocked FBXW7-induced SOX9 degradation in cultured cells. All variants increased SOX9-mediated reporter activation independently of protein stabilization, suggesting that CPD may also modulate the transactivation function of SOX9. 
Altogether, these findings concur that CPD has critical functions, that SOX9 decisively controls odontogenesis, and that gain-of-function variants may markedly perturb both this process and skeletogenesis.
{"title":"Missense variants weakening a SOX9 phosphodegron linked to odontogenesis defects, scoliosis, and other skeletal features.","authors":"Imane Ettaki, Abdul Haseeb, Anirudha Karvande, Ghita Amalou, Asmae Saih, Imane AitRaise, Salsabil Hamdi, Lahcen Wakrim, Abdelhamid Barakat, Hassan Fellah, Mustapha El Alloussi, Véronique Lefebvre","doi":"10.1016/j.xhgg.2025.100404","DOIUrl":"10.1016/j.xhgg.2025.100404","url":null,"abstract":"<p><p>SOX9 encodes an SRY-related transcription factor critical for chondrogenesis and sex determination among other processes. Loss-of-function variants cause campomelic dysplasia and Pierre Robin sequence, while both gain- and loss-of-function variants cause disorders of sex development. SOX9 has also been linked to scoliosis and cancers, but variants are undetermined. It is highly expressed in tooth progenitor cells, but its odontogenic roles remain elusive, and tooth defects are unreported in SOX9-related conditions. Here, we performed whole-exome sequencing for nine unrelated children with tooth eruption delay and no known syndromes and identified a 7-year-old girl heterozygous for a SOX9 p.Thr239Pro variant and a 10-year-old boy heterozygous for presumably adjacent p.Thr239Pro and p.Thr240Pro variants. These variants were de novo and rare in control populations. Both cases had primary tooth eruption delay. Additionally, the boy had mesiodens blocking permanent central upper incisor eruption, severe scoliosis, and mild craniofacial and appendicular skeleton abnormalities. p.Thr239 and p.Thr240 occupy variable and obligatory positions, respectively, in a cell division control protein 4 (Cdc4)/FBXW7-targeted phosphodegron motif (CPD) fully conserved in SOX9 vertebrate orthologs and SOX8 and SOX10 paralogs, but functionally uncharacterized in vivo. Structural modeling predicted p.Thr240Pro and p.Thr239Pro/p.Thr240Pro but not p.Thr239Pro to strongly reduce SOX9/FBXW7 interaction. 
Accordingly, p.Thr240Pro and p.Thr239Pro/p.Thr240Pro but not p.Thr239Pro blocked FBXW7-induced SOX9 degradation in cultured cells. All variants increased SOX9-mediated reporter activation independently of protein stabilization, suggesting that CPD may also modulate the transactivation function of SOX9. Altogether, these findings concur that CPD has critical functions, that SOX9 decisively controls odontogenesis, and that gain-of-function variants may markedly perturb both this process and skeletogenesis.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100404"},"PeriodicalIF":3.3,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}