Pub Date : 2024-11-07 DOI: 10.1016/j.ajhg.2024.10.012
Shiyang Ma, Fan Wang, Richard Border, Joseph Buxbaum, Noah Zaitlen, Iuliana Ionita-Laza
Local genetic correlation analysis is an important tool for identifying genetic loci with shared biology across traits. Recently, Border et al. have shown that the results of these analyses are confounded by cross-trait assortative mating (xAM), leading to many false-positive findings. Here, we describe LAVA-Knock, a local genetic correlation method that builds off an existing genetic correlation method, LAVA, and augments it by generating synthetic data in a way that preserves local and long-range linkage disequilibrium (LD), allowing us to reduce the confounding induced by xAM. We show in simulations based on a realistic xAM model and in genome-wide association study (GWAS) applications for 630 trait pairs that LAVA-Knock can greatly reduce the bias due to xAM relative to LAVA. Furthermore, we show a significant positive correlation between the reduction in local genetic correlations and estimates in the literature of cross-mate phenotype correlations; in particular, pairs of traits that are known to have high cross-mate phenotype correlation values have a significantly higher reduction in the number of local genetic correlations compared with other trait pairs. A few representative examples include education and intelligence, education and alcohol consumption, and attention-deficit hyperactivity disorder and depression. These results suggest that LAVA-Knock can reduce confounding due to both short-range LD and long-range LD induced by xAM.
Title: Local genetic correlation via knockoffs reduces confounding due to cross-trait assortative mating. Journal: American Journal of Human Genetics.
Pub Date : 2024-11-07 Epub Date: 2024-10-01 DOI: 10.1016/j.ajhg.2024.09.002
Xiaoyu Yin, Marcy Richardson, Andreas Laner, Xuemei Shi, Elisabet Ognedal, Valeria Vasta, Thomas V O Hansen, Marta Pineda, Deborah Ritter, Johan de Dunnen, Emadeldin Hassanin, Wencong Lyman Lin, Ester Borras, Karl Krahn, Margareta Nordling, Alexandra Martins, Khalid Mahmood, Emily Nadeau, Victoria Beshay, Carli Tops, Maurizio Genuardi, Tina Pesaran, Ian M Frayling, Gabriel Capellá, Andrew Latchford, Sean V Tavtigian, Carlo Maj, Sharon E Plon, Marc S Greenblatt, Finlay A Macrae, Isabel Spier, Stefan Aretz
Pathogenic constitutional APC variants underlie familial adenomatous polyposis, the most common hereditary gastrointestinal polyposis syndrome. To improve variant classification and resolve the interpretative challenges of variants of uncertain significance (VUSs), APC-specific variant classification criteria were developed by the ClinGen-InSiGHT Hereditary Colorectal Cancer/Polyposis Variant Curation Expert Panel (VCEP) based on the criteria of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP). A streamlined algorithm using the APC-specific criteria was developed and applied to assess all APC variants in ClinVar and the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) international reference APC Leiden Open Variation Database (LOVD), which included a total of 10,228 unique APC variants. Among the ClinVar and LOVD variants with an initial classification of (likely) benign or (likely) pathogenic, 94% and 96% remained in their original categories, respectively. In contrast, 41% of ClinVar VUSs and 61% of LOVD VUSs were reclassified into clinically meaningful classes, the vast majority as (likely) benign. The total number of VUSs was reduced by 37%. For 24 of 37 (65%) promising APC variants that remained VUSs despite evidence for pathogenicity, a data-mining-driven work-up allowed reclassification as (likely) pathogenic. These results demonstrated that the application of APC-specific criteria substantially reduced the number of VUSs in ClinVar and LOVD. The study also demonstrated the feasibility of a systematic approach to variant classification in large datasets, which might serve as a generalizable model for other gene- or disease-specific variant interpretation initiatives. It also allowed for the prioritization of VUSs that will benefit from in-depth evidence collection. This subset of APC variants was approved by the VCEP and made publicly available through ClinVar and LOVD for widespread clinical use.
Title: Large-scale application of ClinGen-InSiGHT APC-specific ACMG/AMP variant classification criteria leads to substantial reduction in VUS. Journal: American Journal of Human Genetics, pp. 2427-2443. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568752/pdf/
Pub Date : 2024-11-05 DOI: 10.1016/j.ajhg.2024.10.006
Yulia Mostovoy, Philip M Boone, Yongqing Huang, Kiran V Garimella, Kar-Tong Tan, Bianca E Russell, Monica Salani, Celine E F de Esch, John Lemanski, Benjamin Curall, Jen Hauenstein, Diane Lucente, Tera Bowers, Tim DeSmet, Stacey Gabriel, Cynthia C Morton, Matthew Meyerson, Alex R Hastie, James Gusella, Fabiola Quintero-Rivera, Harrison Brand, Michael E Talkowski
Delineation of structural variants (SVs) at sequence resolution in highly repetitive genomic regions has long been intractable. The sequence properties, origins, and functional effects of classes of genomic rearrangements such as ring chromosomes and Robertsonian translocations thus remain unknown. To resolve these complex structures, we leveraged several recent milestones in the field, including (1) the emergence of long-read sequencing, (2) the gapless telomere-to-telomere (T2T) assembly, and (3) a tool (BigClipper) to discover chromosomal rearrangements from long reads. We applied these technologies across 13 cases with ring chromosomes, Robertsonian translocations, and complex SVs that were unresolved by short reads, followed by validation using optical genome mapping (OGM). Our analyses resolved 10 of 13 cases, including a Robertsonian translocation and all ring chromosomes. Multiple breakpoints were localized to genomic regions previously recalcitrant to sequencing such as acrocentric p-arms, ribosomal DNA arrays, and telomeric repeats, and involved complex structures such as a deletion-inversion and interchromosomal dispersed duplications. We further performed methylation profiling from long-read data to discover phased differential methylation in a gene promoter proximal to a ring fusion, suggesting a long-range position effect (LRPE) with heterochromatin spreading. Breakpoint sequences suggested mechanisms of SV formation such as microhomology-mediated and non-homologous end-joining, as well as non-allelic homologous recombination. These methods provide some of the first glimpses into the sequence resolution of Robertsonian translocations and illuminate the structural diversity of ring chromosomes and complex chromosomal rearrangements with implications for genome biology, prediction of LRPEs from integrated multi-omics technologies, and molecular diagnostics in rare disease cases.
Title: Resolution of ring chromosomes, Robertsonian translocations, and complex structural variants from long-read sequencing and telomere-to-telomere assembly. Journal: American Journal of Human Genetics.
Pub Date : 2024-10-30 DOI: 10.1016/j.ajhg.2024.10.004
Sijie Yao, Kaiqiao Li, Tingyi Li, Xiaoqing Yu, Pei Fen Kuan, Xuefeng Wang
The search for prognostic biomarkers capable of predicting patient outcomes, by analyzing gene expression in tissue samples and other molecular profiles, remains largely focused on single-gene-based or global-gene-search approaches. Gene-centric approaches, while foundational, fail to capture the higher-order dependencies that reflect the activities of co-regulated processes, pathway alterations, and regulatory networks, all of which are crucial in determining the patient outcomes in complex diseases like cancer. Here, we introduce GPS-Net, a computational framework that fills the gap in efficiently identifying prognostic modules by incorporating the holistic pathway structures and the network of gene interactions. By innovatively incorporating advanced multiple kernel learning techniques and network-based regularization, the proposed method not only enhances the accuracy of biomarker and pathway identification but also significantly reduces computational complexity, as demonstrated by extensive simulation studies. Applying GPS-Net, we identified key pathways that are predictive of patient outcomes in a cancer immunotherapy study. Overall, our approach provides a novel framework that renders genome-wide pathway-level prognostic analysis both feasible and scalable, synergizing both mechanism-driven and data-driven methodologies for precision genomics.
Title: GPS-Net: Discovering prognostic pathway modules based on network regularized kernel learning. Journal: American Journal of Human Genetics.
Pub Date : 2024-10-29 DOI: 10.1016/j.ajhg.2024.10.005
Bin Zhu, Avraam Tapinos, Hela Koka, Priscilla Ming Yi Lee, Tongwu Zhang, Wei Zhu, Xiaoyu Wang, Alyssa Klein, DongHyuk Lee, Gary M Tse, Koon-Ho Tsang, Cherry Wu, Min Hua, Chad A Highfill, Petra Lenz, Weiyin Zhou, Difei Wang, Wen Luo, Kristine Jones, Amy Hutchinson, Belynda Hicks, Montserrat Garcia-Closas, Stephen Chanock, Lap Ah Tse, David C Wedge, Xiaohong R Yang
Normal tissues adjacent to the tumor (NATs) may harbor early breast carcinogenesis events driven by field cancerization. Although previous studies have characterized copy-number (CN) and transcriptomic alterations, the evolutionary history of NATs in breast cancer (BC) remains poorly characterized. Utilizing whole-genome sequencing (WGS), methylation profiling, and RNA sequencing (RNA-seq), we analyzed paired germline, NAT, and tumor samples from 43 individuals with BC in Hong Kong (HK). We found that single-nucleotide variants (SNVs) were common in NATs, with one-third of NAT samples exhibiting SNVs in driver genes, many of which were present in paired tumor samples. The most frequently mutated genes in both tumor and NAT samples were PIK3CA, TP53, GATA3, and AKT1. In contrast, large-scale aberrations such as somatic CN alterations (SCNAs) and structural variants (SVs) were rarely detected in NAT samples. We generated phylogenetic trees to investigate the evolutionary history of paired NAT and tumor samples. They could be categorized into tumor-only, shared-tree, and multiple-tree groups, the last of which is concordant with non-genetic field cancerization. These groups exhibited distinct genomic and epigenomic characteristics in both NAT and tumor samples. Specifically, NAT samples in the shared-tree group showed a higher number of mutations, while NAT samples belonging to the multiple-tree group showed a less inflammatory tumor microenvironment (TME), characterized by a higher proportion of regulatory T cells (Tregs) and a lower presence of CD14 cell populations. In summary, our findings highlight the diverse evolutionary history in BC NAT/tumor pairs and the impact of field cancerization and TME in shaping the genomic evolutionary history of tumors.
Title: Genomes and epigenomes of matched normal and tumor breast tissue reveal diverse evolutionary trajectories and tumor-host interactions. Journal: American Journal of Human Genetics.
Pub Date : 2024-10-22 DOI: 10.1016/j.ajhg.2024.10.003
John J Y Lee, Michael J Johnston, Hamza Farooq, Huey-Miin Chen, Subhi Talal Younes, Raul Suarez, Melissa Zwaig, Nikoleta Juretic, William A Weiss, Jiannis Ragoussis, Nada Jabado, Michael D Taylor, Marco Gallo
Four main medulloblastoma (MB) molecular subtypes have been identified based on transcriptional, DNA methylation, and genetic profiles. However, it is currently not known whether 3D genome architecture differs between MB subtypes. To address this question, we performed in situ Hi-C to reconstruct the 3D genome architecture of MB subtypes. In total, we generated Hi-C and matching transcriptome data for 28 surgical specimens and Hi-C data for one patient-derived xenograft. The average resolution of the Hi-C maps was 6,833 bp. Using these data, we found that insulation scores of topologically associating domains (TADs) were effective at distinguishing MB molecular subgroups. TAD insulation score differences between subtypes were globally not associated with differential gene expression, although we identified a few exceptions near genes expressed in the lineages of origin of specific MB subtypes. Our study therefore supports the notion that TAD insulation scores can distinguish MB subtypes independently of their transcriptional differences.
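As a rough illustration of what a TAD insulation score measures, the following minimal sketch applies a commonly used sliding-window definition to a toy contact matrix. It is not taken from the paper: the function name `insulation_scores`, the window size, and the toy matrix are all assumptions made for illustration, and the authors' actual pipeline may differ.

```python
import math


def insulation_scores(contacts, window):
    """Return a log2 insulation score per bin of a square contact matrix.

    For bin i, average the contacts between the `window` bins upstream
    (i-window .. i-1) and the `window` bins downstream (i+1 .. i+window),
    then normalise by the mean over all scorable bins. Bins too close to
    the matrix edge are returned as None.
    """
    n = len(contacts)
    raw = [None] * n
    for i in range(window, n - window):
        total = 0.0
        for a in range(i - window, i):
            for b in range(i + 1, i + window + 1):
                total += contacts[a][b]
        raw[i] = total / (window * window)

    valid = [v for v in raw if v is not None]
    mean = sum(valid) / len(valid)
    if mean == 0:
        raise ValueError("no contacts in any scorable window")

    scores = []
    for v in raw:
        if v is None:
            scores.append(None)
        elif v == 0:
            scores.append(float("-inf"))
        else:
            scores.append(math.log2(v / mean))
    return scores


# Hypothetical toy matrix: two 4-bin domains with strong contacts inside
# each domain (10.0) and weak contacts between the domains (1.0).
n, block = 8, 4
contacts = [
    [10.0 if (a // block) == (b // block) else 1.0 for b in range(n)]
    for a in range(n)
]
scores = insulation_scores(contacts, window=2)
```

In this toy case the lowest scores fall on the bins flanking the boundary between the two domains, which is the kind of local minimum that boundary callers look for.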
Title: 3D genome topology distinguishes molecular subgroups of medulloblastoma. Journal: American Journal of Human Genetics.
Pub Date : 2024-10-21 DOI: 10.1016/j.ajhg.2024.09.008
Yosuke Tanigawa, Manolis Kellis
Balancing the tradeoff between quantity and quality of phenotypic data is critical in omics studies. Measurements below the limit of quantification (BLQ) are often tagged in quality control fields, but these flags are currently underutilized in human genetics studies. Extreme phenotype sampling is advantageous for mapping rare variant effects. We hypothesize that genetic drivers, along with environmental and technical factors, contribute to the presence of BLQ flags. Here, we introduce "hypometric genetics" (hMG) analysis and uncover a genetic basis for BLQ flags, indicating an additional source of genetic signal for genetic discovery, especially from phenotypic extremes. Applying our hMG approach to n = 227,469 UK Biobank individuals with metabolomic profiles, we reveal more than 5% heritability for BLQ flags and report biologically relevant associations, for example, at APOC3, APOA5, and PDE3B loci. For common variants, polygenic scores trained only for BLQ flags predict the corresponding quantitative traits with 91% accuracy, validating the genetic basis. For rare coding variant associations, we find an asymmetric 65.4% higher enrichment of metabolite-lowering associations for BLQ flags, highlighting the impact of putative loss-of-function variants with large effects on phenotypic extremes. Joint analysis of binarized BLQ flags and the corresponding quantitative metabolite measurements improves power in Bayesian rare variant aggregation tests, resulting in an average of 181% more prioritized genes. Our approach is broadly applicable to omics profiling. Overall, our results underscore the benefit of integrating quality control flags and quantitative measurements and highlight the advantage of joint analysis of population-based samples and phenotypic extremes in human genetics studies.
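The idea of analyzing a binarized BLQ flag alongside the quantitative measurement can be sketched as follows. This is only an assumed data layout, not the authors' implementation: the `Measurement` record, its field names, and `split_phenotypes` are hypothetical, and the sample values are made up for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Measurement(NamedTuple):
    sample_id: str
    value: Optional[float]  # None when no usable concentration was reported
    below_loq: bool         # QC flag: below the limit of quantification (BLQ)


def split_phenotypes(
    measurements: List[Measurement],
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Derive two phenotypes from one metabolite's measurements.

    binary:       1 if the sample carries the BLQ flag, else 0 (all samples).
    quantitative: the measured value, kept only for unflagged samples.
    """
    binary = {m.sample_id: int(m.below_loq) for m in measurements}
    quantitative = {
        m.sample_id: m.value
        for m in measurements
        if not m.below_loq and m.value is not None
    }
    return binary, quantitative


# Made-up example records for a single metabolite.
records = [
    Measurement("s1", 1.8, False),
    Measurement("s2", None, True),
    Measurement("s3", 2.4, False),
]
binary, quantitative = split_phenotypes(records)
```

Keeping the flag as its own binary trait means flagged samples still contribute to the analysis instead of being dropped along with their missing values.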
{"title":"Hypometric genetics: Improved power in genetic discovery by incorporating quality control flags.","authors":"Yosuke Tanigawa,Manolis Kellis","doi":"10.1016/j.ajhg.2024.09.008","DOIUrl":"https://doi.org/10.1016/j.ajhg.2024.09.008","PeriodicalId":7659,"journal":{"name":"American journal of human genetics","volume":"25 1","pages":""},"PeriodicalIF":9.8,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142489504","RegionNum":1,"RegionCategory":"Biology"}
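As a rough illustration of what "binarized BLQ flags" alongside "the corresponding quantitative metabolite measurements" could look like in practice, the following minimal Python sketch splits one metabolite's raw readings into a 0/1 flag vector and a quantitative vector. It is not the authors' pipeline: the function name `binarize_blq`, the `loq` threshold, and the example values are hypothetical, and in the study the flags come from existing quality control fields rather than from thresholding raw values as assumed here.

```python
from typing import List, Optional, Tuple


def binarize_blq(
    values: List[float], loq: float
) -> Tuple[List[int], List[Optional[float]]]:
    """Split raw readings into BLQ flags and quantitative values.

    Assumption (hypothetical, for illustration only): a reading is
    "below the limit of quantification" when it is strictly less
    than `loq`.
    """
    # 1 marks a reading below the limit of quantification, 0 otherwise.
    flags = [1 if v < loq else 0 for v in values]
    # Flagged readings are masked as None so they are not treated as
    # reliable quantitative measurements downstream.
    quantitative = [None if v < loq else v for v in values]
    return flags, quantitative


# Made-up readings for a single metabolite across four individuals.
flags, quantitative = binarize_blq([0.2, 1.5, 0.05, 3.0], loq=0.1)
print(flags)
print(quantitative)
```

Keeping the two vectors separate reflects the idea in the abstract that the binary flag can be analyzed as a trait in its own right and then combined with the quantitative measurement in a joint analysis.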
Pub Date : 2024-10-03 Epub Date: 2024-09-05 DOI: 10.1016/j.ajhg.2024.08.008
Luoying Jiang, Shao Wei Hu, Zijing Wang, Yi Zhou, Honghai Tang, Yuxin Chen, Daqi Wang, Xintai Fan, Lei Han, Huawei Li, Dazhi Shi, Yingzi He, Yilai Shu
Gene therapy has made significant progress in the treatment of hereditary hearing loss. However, most research has focused on deafness-related genes that are primarily expressed in hair cells with less attention given to multisite-expressed deafness genes. MPZL2, the second leading cause of mild-to-moderate hereditary deafness, is widely expressed in different inner ear cells. We generated a mouse model with a deletion in the Mpzl2 gene, which displayed moderate and slowly progressive hearing loss, mimicking the phenotype of individuals with DFNB111. We developed a gene replacement therapy system mediated by AAV-ie for efficient transduction in various types of cochlear cells. AAV-ie-Mpzl2 administration significantly lowered the auditory brainstem response and distortion product otoacoustic emission thresholds of Mpzl2-/- mice for at least seven months. AAV-ie-Mpzl2 delivery restored the structural integrity in both outer hair cells and Deiters cells. This study suggests the potential of gene therapy for MPZL2-related deafness and provides a proof of concept for gene therapy targeting other deafness-related genes that are expressed in different cell populations in the cochlea.
{"title":"Hearing restoration by gene replacement therapy for a multisite-expressed gene in a mouse model of human DFNB111 deafness.","authors":"Luoying Jiang, Shao Wei Hu, Zijing Wang, Yi Zhou, Honghai Tang, Yuxin Chen, Daqi Wang, Xintai Fan, Lei Han, Huawei Li, Dazhi Shi, Yingzi He, Yilai Shu","doi":"10.1016/j.ajhg.2024.08.008","DOIUrl":"10.1016/j.ajhg.2024.08.008","PeriodicalId":7659,"journal":{"name":"American journal of human genetics","volume":" ","pages":"2253-2264"},"PeriodicalIF":8.1,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11480802/pdf/","platform":"Semanticscholar","paperid":"142144956","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date : 2024-10-03 Epub Date: 2024-08-26 DOI: 10.1016/j.ajhg.2024.07.021
Juehan Wang, Zixuan Zhang, Zeyun Lu, Nicholas Mancuso, Steven Gazal
Multi-ancestry genome-wide association studies (GWASs) have highlighted the existence of variants with ancestry-specific effect sizes. Understanding where and why these ancestry-specific effects occur is fundamental to understanding the genetic basis of human diseases and complex traits. Here, we characterized genes differentially expressed across ancestries (ancDE genes) at the cell-type level by leveraging single-cell RNA-sequencing data in peripheral blood mononuclear cells for 21 individuals with East Asian (EAS) ancestry and 23 individuals with European (EUR) ancestry (172,385 cells); then, we tested whether variants surrounding those genes were enriched in disease variants with ancestry-specific effect sizes by leveraging ancestry-matched GWASs of 31 diseases and complex traits (average n ∼ 90,000 and ∼ 267,000 in EAS and EUR, respectively). We observed that ancDE genes tended to be cell-type specific and enriched in genes interacting with the environment and in variants with ancestry-specific disease effect sizes, which suggests cell-type-specific, gene-by-environment interactions shared between regulatory and disease architectures. Finally, we illustrated how different environments might have led to ancestry-specific myeloid cell leukemia 1 (MCL1) expression in B cells and ancestry-specific allele effect sizes in lymphocyte count GWASs for variants surrounding MCL1. Our results imply that large single-cell and GWAS datasets from diverse ancestries are required to improve our understanding of human diseases.
{"title":"Genes with differential expression across ancestries are enriched in ancestry-specific disease effects likely due to gene-by-environment interactions.","authors":"Juehan Wang, Zixuan Zhang, Zeyun Lu, Nicholas Mancuso, Steven Gazal","doi":"10.1016/j.ajhg.2024.07.021","DOIUrl":"10.1016/j.ajhg.2024.07.021","PeriodicalId":7659,"journal":{"name":"American journal of human genetics","volume":" ","pages":"2117-2128"},"PeriodicalIF":8.1,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11480800/pdf/","platform":"Semanticscholar","paperid":"142078889","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date : 2024-10-03 Epub Date: 2024-09-02 DOI: 10.1016/j.ajhg.2024.08.002
Patricia J Sullivan, Julian M W Quinn, Weilin Wu, Mark Pinese, Mark J Cowley
Variants that alter gene splicing are estimated to comprise up to a third of all disease-causing variants, yet they are hard to predict from DNA sequencing data alone. To overcome this, many groups are incorporating RNA-based analyses, which are resource intensive, particularly for diagnostic laboratories. There are thousands of functionally validated variants that induce mis-splicing; however, this information is not consolidated, and they are under-represented in ClinVar, which presents a barrier to variant interpretation and can result in duplication of validation efforts. To address this issue, we developed SpliceVarDB, an online database consolidating over 50,000 variants assayed for their effects on splicing in over 8,000 human genes. We evaluated over 500 published data sources and established a spliceogenicity scale to standardize, harmonize, and consolidate variant validation data generated by a range of experimental protocols. According to the strength of their supporting evidence, variants were classified as "splice-altering" (∼25%), "not splice-altering" (∼25%), and "low-frequency splice-altering" (∼50%), the last of which corresponds to weak or indeterminate evidence of spliceogenicity. Importantly, 55% of the splice-altering variants in SpliceVarDB are outside the canonical splice sites (5.6% are deep intronic). These variants can support the variant curation diagnostic pathway and can be used to provide the high-quality data necessary to develop more accurate in silico splicing predictors. The variants are accessible through an online platform, SpliceVarDB, with additional features for visualization, variant information, in silico predictions, and validation metrics. SpliceVarDB is a very large collection of splice-altering variants and is available at https://splicevardb.org.
{"title":"SpliceVarDB: A comprehensive database of experimentally validated human splicing variants.","authors":"Patricia J Sullivan, Julian M W Quinn, Weilin Wu, Mark Pinese, Mark J Cowley","doi":"10.1016/j.ajhg.2024.08.002","DOIUrl":"10.1016/j.ajhg.2024.08.002","PeriodicalId":7659,"journal":{"name":"American journal of human genetics","volume":" ","pages":"2164-2175"},"PeriodicalIF":8.1,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11480807/pdf/","platform":"Semanticscholar","paperid":"142124576","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}