Pub Date: 2024-09-05; Epub Date: 2024-08-07; DOI: 10.1016/j.ajhg.2024.07.006
Katherine A Wood, R Spencer Tong, Marialetizia Motta, Viviana Cordeddu, Eleanor R Scimone, Stephen J Bush, Dale W Maxwell, Eleni Giannoulatou, Viviana Caputo, Alice Traversa, Cecilia Mancini, Giovanni B Ferrero, Francesco Benedicenti, Paola Grammatico, Daniela Melis, Katharina Steindl, Nicola Brunetti-Pierri, Eva Trevisson, Andrew O M Wilkie, Angela E Lin, Valerie Cormier-Daire, Stephen R F Twigg, Marco Tartaglia, Anne Goriely
While it is widely thought that de novo mutations (DNMs) occur randomly, we previously showed that some DNMs are enriched because they are positively selected in the testes of aging men. These "selfish" mutations cause disorders with a shared set of features, including exclusive paternal origin, a significant increase in paternal age, and a high apparent germline mutation rate. To date, all known selfish mutations cluster within the components of the RTK-RAS-MAPK signaling pathway, a critical modulator of testicular homeostasis. Here, we demonstrate the selfish nature of the SMAD4 DNMs causing Myhre syndrome (MYHRS). By analyzing 16 informative trios, we show that MYHRS-causing DNMs originated on the paternally derived allele in all cases. We document a statistically significant epidemiological paternal age effect of 6.3 years excess for fathers of MYHRS probands. We developed an ultra-sensitive assay to quantify spontaneous MYHRS-causing SMAD4 variants in sperm and show that pathogenic variants at codon 500 are found at elevated levels in the sperm of most men and exhibit a strong positive correlation with the donor's age, indicative of a high apparent germline mutation rate. Finally, we performed in vitro assays to validate the peculiar functional behavior of the clonally selected DNMs and explored the basis of the pathophysiology of the different SMAD4 sperm-enriched variants. Taken together, these data provide compelling evidence that SMAD4, a gene operating outside the canonical RAS-MAPK signaling pathway, is associated with selfish spermatogonial selection and raise the possibility that other genes/pathways are under positive selection in the aging human testis.
Title: SMAD4 mutations causing Myhre syndrome are under positive selection in the male germline.
Journal: American Journal of Human Genetics, pages 1953-1969. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11444041/pdf/
Pub Date: 2024-09-05; Epub Date: 2024-08-12; DOI: 10.1016/j.ajhg.2024.07.010
Christopher J Shore, Sergio Villicaña, Julia S El-Sayed Moustafa, Amy L Roberts, David A Gunn, Veronique Bataille, Panos Deloukas, Tim D Spector, Kerrin S Small, Jordana T Bell
Whole-skin DNA methylation variation has been implicated in several diseases, including melanoma, but its genetic basis has not yet been fully characterized. Using bulk skin tissue samples from 414 healthy female UK twins, we performed twin-based heritability and methylation quantitative trait loci (meQTL) analyses for >400,000 DNA methylation sites. We find that the human skin DNA methylome is on average less heritable than previously estimated in blood and other tissues (mean heritability: 10.02%). meQTL analysis identified local genetic effects influencing DNA methylation at 18.8% (76,442) of tested CpG sites, as well as 1,775 CpG sites associated with at least one distal genetic variant. As a functional follow-up, we performed skin expression QTL (eQTL) analyses in a partially overlapping sample of 604 female twins. Colocalization analysis identified over 3,500 shared genetic effects affecting thousands of CpG sites (10,067) and genes (4,475). Mediation analysis of putative colocalized gene-CpG pairs identified 114 genes with evidence for eQTL effects being mediated by DNA methylation in skin, including genes implicated in skin disease such as ALOX12 and CSPG4. We further explored the relevance of skin meQTLs to skin disease and found that skin meQTLs and CpGs under genetic influence were enriched for multiple skin-related genome-wide and epigenome-wide association signals, including for melanoma and psoriasis. Our findings give insights into the regulatory landscape of epigenomic variation in skin.
Title: Genetic effects on the skin methylome in healthy older twins.
Journal: American Journal of Human Genetics, pages 1932-1952. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393713/pdf/
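The skin methylome abstract above reports twin-based heritability without naming the model used. As a rough illustration of the idea only, the sketch below applies Falconer's classic estimator, h2 = 2 * (r_MZ - r_DZ), to one methylation site. The function names and toy values are hypothetical, and a plain Pearson correlation across pairs is a simplification; this is not presented as the study's method.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def falconer_h2(mz_pairs, dz_pairs):
    """Falconer estimate of heritability from twin pairs.

    mz_pairs, dz_pairs: lists of (twin1_value, twin2_value) for
    monozygotic and dizygotic pairs (hypothetical input layout).
    The estimate is clamped to [0, 1] because sampling noise can
    push the raw value outside that range.
    """
    r_mz = pearson([a for a, _ in mz_pairs], [b for _, b in mz_pairs])
    r_dz = pearson([a for a, _ in dz_pairs], [b for _, b in dz_pairs])
    h2 = 2.0 * (r_mz - r_dz)
    return min(max(h2, 0.0), 1.0)

# Toy methylation values for one CpG site (made up for illustration).
mz = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
dz = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
print(falconer_h2(mz, dz))
```

Averaging such per-site estimates over all tested sites would give a mean heritability of the kind quoted in the abstract, although variance-component models are the more common choice in practice.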
Pub Date: 2024-09-04; DOI: 10.1016/j.ajhg.2024.08.010
Junyoung Kim, Kai Wang, Chunhua Weng, Cong Liu
Phenotype-driven gene prioritization is fundamental to diagnosing rare genetic disorders. While traditional approaches rely on curated knowledge graphs with phenotype-gene relations, recent advancements in large language models (LLMs) promise a streamlined text-to-gene solution. In this study, we evaluated five LLMs, two from the generative pre-trained transformer (GPT) series and three from the Llama2 series, assessing their performance across task completeness, gene prediction accuracy, and adherence to required output structures. We conducted experiments exploring various combinations of models, prompts, phenotypic input types, and task difficulty levels. Our findings revealed that the best-performing LLM, GPT-4, achieved an average accuracy of 17.0% in identifying diagnosed genes within the top 50 predictions, which still falls behind traditional tools. However, accuracy increased with model size. Consistent results were observed over time, as shown in the dataset curated after 2023. Advanced techniques such as retrieval-augmented generation (RAG) and few-shot learning did not improve accuracy. Sophisticated prompts were more likely to enhance task completeness, especially in smaller models. Conversely, complicated prompts tended to decrease the output structure compliance rate. LLMs also achieved better-than-random prediction accuracy with free-text input, though performance was slightly lower than with standardized concept input. Bias analysis showed that highly cited genes, such as BRCA1, TP53, and PTEN, are more likely to be predicted. Our study provides valuable insights into integrating LLMs with genomic analysis, contributing to the ongoing discussion on their utilization in clinical workflows.
Title: Assessing the utility of large language models for phenotype-driven gene prioritization in the diagnosis of rare genetic disease.
Journal: American Journal of Human Genetics.
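The LLM abstract above measures accuracy as the share of cases whose diagnosed gene appears within the top 50 predictions. A minimal sketch of that metric follows, assuming each case is a ranked gene list paired with the diagnosed gene. The function name, input layout, and case-insensitive matching are illustrative assumptions rather than details taken from the study.

```python
def top_k_accuracy(cases, k=50):
    """Fraction of cases whose diagnosed gene is in the top-k predictions.

    cases: list of (ranked_genes, diagnosed_gene) tuples, where
    ranked_genes is ordered from most to least likely (assumed layout).
    Gene symbols are compared case-insensitively.
    """
    if not cases:
        raise ValueError("no cases to score")
    hits = 0
    for ranked_genes, diagnosed_gene in cases:
        top_k = {gene.upper() for gene in ranked_genes[:k]}
        if diagnosed_gene.upper() in top_k:
            hits += 1
    return hits / len(cases)

# Toy predictions (made up for illustration).
cases = [
    (["BRCA1", "TP53"], "TP53"),   # diagnosed gene ranked second
    (["PTEN"], "SMAD4"),           # diagnosed gene never predicted
]
print(top_k_accuracy(cases, k=50))
print(top_k_accuracy(cases, k=1))
```

Lowering `k` makes the metric stricter, which is one way task difficulty can be varied when comparing models.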
Pub Date: 2024-09-04; DOI: 10.1016/j.ajhg.2024.08.018
Derek Shyr, Rounak Dey, Xihao Li, Hufeng Zhou, Eric Boerwinkle, Steve Buyske, Mark Daly, Richard A Gibbs, Ira Hall, Tara Matise, Catherine Reeves, Nathan O Stitziel, Michael Zody, Benjamin M Neale, Xihong Lin
Large-scale, multi-ethnic whole-genome sequencing (WGS) studies, such as the National Human Genome Research Institute Genome Sequencing Program's Centers for Common Disease Genomics (CCDG), play an important role in increasing diversity for genetic research. Before performing association analyses, assessing Hardy-Weinberg equilibrium (HWE) is a crucial step in quality control procedures to remove low-quality variants and ensure valid downstream analyses. Diverse WGS studies contain ancestrally heterogeneous samples; however, commonly used HWE methods assume that the samples are homogeneous. Therefore, directly applying these methods to the whole dataset can yield statistically invalid results. To account for this heterogeneity, HWE can be tested on subsets of samples that have genetically homogeneous ancestries and the results aggregated at each variant. To facilitate valid HWE subset testing, we developed a semi-supervised learning approach that predicts homogeneous ancestries based on the genotype. This method provides a convenient tool for estimating HWE in the presence of population structure and missing self-reported race and ethnicities in diverse WGS studies. In addition, assessing HWE within the homogeneous ancestries provides reliable HWE estimates that will directly benefit downstream analyses, including association analyses in WGS studies. We applied our proposed method to the CCDG dataset, predicting homogeneous genetic ancestry groups for 60,545 multi-ethnic WGS samples to assess HWE within each group.
Title: Semi-supervised machine learning method for predicting homogeneous ancestry groups to assess Hardy-Weinberg equilibrium in diverse whole-genome sequencing studies.
Journal: American Journal of Human Genetics.
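The HWE abstract above notes that testing a pooled, ancestrally mixed sample can give invalid results, while testing within homogeneous groups does not. The sketch below illustrates that point with a textbook one-degree-of-freedom chi-square HWE test on made-up genotype counts. The chi-square test is a simple stand-in chosen for illustration; the semi-supervised ancestry prediction and the per-variant aggregation rule from the study are not shown, and all names and numbers are hypothetical.

```python
from math import erfc, sqrt

def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium at a biallelic site.

    Returns (chi2, p_value). The p-value uses the 1-df chi-square
    survival function, which equals erfc(sqrt(chi2 / 2)).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, erfc(sqrt(chi2 / 2))

# Two toy ancestry groups, each in equilibrium on its own but with
# very different allele frequencies (0.9 versus 0.1).
group_1 = (810, 180, 10)
group_2 = (10, 180, 810)
pooled = tuple(a + b for a, b in zip(group_1, group_2))

for label, counts in [("group 1", group_1), ("group 2", group_2), ("pooled", pooled)]:
    chi2, p_value = hwe_chi2(*counts)
    print(label, round(chi2, 3), p_value)
```

Each group passes on its own, yet the pooled counts show a large heterozygote deficit (the Wahlund effect), so a well-genotyped variant would be wrongly flagged if the groups were not separated first.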
Pub Date: 2024-09-03; DOI: 10.1016/j.ajhg.2024.08.009
Yuxi Liu, Cheng Peng, Ina S Brorson, Denise G O'Mahony, Rebecca L Kelly, Yujing J Heng, Gabrielle M Baker, Grethe I Grenaker Alnæs, Clara Bodelon, Daniel G Stover, Eliezer M Van Allen, A Heather Eliassen, Vessela N Kristensen, Rulla M Tamimi, Peter Kraft
The tumor immune microenvironment (TIME) plays key roles in tumor progression and response to immunotherapy. Previous studies have identified individual germline variants associated with differences in TIME. Here, we hypothesize that common variants associated with breast cancer risk or cancer-related traits, represented by polygenic risk scores (PRSs), may jointly influence immune features in TIME. We derived 154 immune traits from bulk gene expression profiles of 764 breast tumors and 598 adjacent normal tissue samples from 825 individuals with breast cancer in the Nurses' Health Study (NHS) and NHSII. Immunohistochemical staining of four immune cell markers was available for a subset of 205 individuals. Germline PRSs were calculated for 16 different traits including breast cancer, autoimmune diseases, type 2 diabetes, ages at menarche and menopause, body mass index (BMI), BMI-adjusted waist-to-hip ratio, alcohol intake, and tobacco smoking. Overall, we identified 44 associations between germline PRSs and immune traits at false discovery rate q < 0.25, including 3 associations with q < 0.05. We observed consistent inverse associations of inflammatory bowel disease (IBD) and Crohn disease (CD) PRSs with interferon signaling and STAT1 scores in breast tumor and adjacent normal tissue; these associations were replicated in a Norwegian cohort. Inverse associations were also consistently observed for IBD PRS and B cell abundance in normal tissue. We also observed positive associations between CD PRS and endothelial cell abundance in tumor. Our findings suggest that the genetic mechanisms that influence immune-related diseases are also associated with TIME in breast cancer.
Title: Germline polygenic risk scores are associated with immune gene expression signature and immune cell infiltration in breast cancer.
Journal: American Journal of Human Genetics.
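The breast cancer abstract above relies on germline polygenic risk scores. In its most common form, a PRS is a weighted sum of risk-allele dosages, with weights taken from an external association study. The sketch below shows that arithmetic, assuming dosages and weights are keyed by variant identifier. The identifiers, weights, and the choice to skip missing variants are all illustrative assumptions, not details from the study.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages: dict of variant ID -> risk-allele dosage (0 to 2).
    weights: dict of variant ID -> per-allele effect size.
    Variants with a weight but no genotype are skipped, which is
    one simple convention among several.
    """
    return sum(weight * dosages[variant]
               for variant, weight in weights.items()
               if variant in dosages)

# Hypothetical variant IDs and effect sizes, for illustration only.
weights = {"var1": 0.1, "var2": 0.3, "var4": 0.5}
dosages = {"var1": 2, "var2": 1, "var3": 0}
print(polygenic_risk_score(dosages, weights))
```

Scores computed this way are usually standardized across the cohort before being tested against traits such as the immune signatures described in the abstract.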
Pub Date: 2024-08-29; DOI: 10.1016/j.ajhg.2024.08.007
Christa Ventresca, Daphne O Martschenko, Robbee Wedow, Mete Civelek, James Tabery, Jedidiah Carlson, Stephen C J Parker, Paula S Ramos
Same-sex sexual behavior has long interested genetics researchers in part because, while there is evidence of heritability, the trait as typically defined is associated with fewer offspring. Investigations of this phenomenon began in the 1990s with linkage studies and continue today with the advent of genome-wide association studies. As this body of research grows, so does critical scientific and ethical review of it. Here, we provide a targeted overview of existing genetics studies on same-sex sexual behavior, highlight the ethical and scientific considerations of this nascent field, and provide recommendations developed by the authors to enhance social and ethical responsibility.
Title: The methodological and ethical concerns of genetic studies of same-sex sexual behavior.
Journal: American Journal of Human Genetics.
Pub Date: 2024-08-08; Epub Date: 2024-06-25; DOI: 10.1016/j.ajhg.2024.06.002
Xavier Bledsoe, Eric R Gamazon
Regulation of gene expression is a vital component of neurological homeostasis. Cataloging the consequences of endogenous gene expression on the physical structure and connectivity of the brain offers a means of unifying trait-associated genetic variation with trait-associated neurological features. We perform tissue-specific transcriptome-wide association studies (TWASs) on over 3,400 neuroimaging phenotypes in the UK Biobank (N = 33,224) using our joint-tissue imputation (JTI)-TWAS method. We identify highly significant associations between predicted expression for 7,192 genes and a wide variety of measures of the brain derived from magnetic resonance imaging (MRI). Our approach generates reproducible results in internal and external replication datasets. Genetically determined expression alone is sufficient for high-fidelity reconstruction of brain structure and organization. We demonstrate complementary benefits of cross-tissue and single-tissue analyses toward an integrated neurobiology and provide evidence that gene expression outside the central nervous system provides unique insights into brain health. As an application, we provide evidence suggesting that the genetically regulated expression of schizophrenia risk genes causally affects over 73% of neurological phenotypes that are altered in individuals with schizophrenia (as identified by neuroimaging studies). Imaging features associated with neuropsychiatric traits can provide valuable insights into underlying pathophysiology. By linking neuroimaging-derived phenotypes with expression levels of specific genes, this resource represents a powerful gene prioritization schema that can improve our understanding of brain function, development, and disease. The use of multiple different cortical and subcortical atlases in the resource facilitates direct integration of these data with findings from a diverse range of clinical neuroimaging studies.
Title: A transcriptomic atlas of the human brain reveals genetically determined aspects of neuropsychiatric health.
Journal: American Journal of Human Genetics, pages 1559-1572. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339608/pdf/
Pub Date: 2024-08-08; Epub Date: 2024-07-02; DOI: 10.1016/j.ajhg.2024.06.005
Amy Nisselle, Bronwyn Terrill, Monika Janinski, Sylvia Metcalfe, Clara Gaff
A health workforce capable of implementing genomic medicine requires effective genomics education. Genomics education interventions developed for health professions over the last two decades, and their impact, are variably described in the literature. To inform an evaluation framework for genomics education, we undertook an exploratory scoping review of published needs assessments for, and/or evaluations of, genomics education interventions for health professionals from 2000 to 2023. We retrieved and screened 4,659 records across the two searches, with 363 being selected for full-text review and consideration by an interdisciplinary working group. 104 articles were selected for inclusion in the review: 60 needs assessments, 52 genomics education evaluations, and eight describing both. Included articles spanned all years and described education interventions in over 30 countries. Target audiences included medical specialists, nurses/midwives, and/or allied health professionals. Evaluation questions, outcomes, and measures were extracted, categorized, and tabulated to iteratively compare measures across stages of genomics education evaluation: planning (pre-implementation), development and delivery (implementation), and impact (immediate, intermediate, or long-term outcomes). They are presented here along with descriptions of study designs. We document the wide variability in evaluation approaches and terminology used to define measures and note that few articles considered downstream (long-term) outcomes of genomics education interventions. Alongside the evaluation framework for genomics education, results from this scoping review form part of a toolkit to help educators undertake rigorous genomics evaluation that is fit for purpose and can contribute to the growing evidence base on the contribution of genomics education to implementation strategies for genomic medicine.
Title: Ensuring best practice in genomics education: A scoping review of genomics education needs assessments and evaluations. Journal: American journal of human genetics, pages 1508-1523. Publication date: 2024-08-08. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339611/pdf/
Pub Date: 2024-08-08 Epub Date: 2024-07-24 DOI: 10.1016/j.ajhg.2024.06.013
Huiling Liao, Haoran Xue, Wei Pan
In Mendelian randomization, two methods based on single-SNP-trait correlations have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait): MR Steiger's method and its recent extension, Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R², the coefficient of determination, to combine information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to using multiple SNPs as instrumental variables (IVs). It is especially useful in transcriptome-wide association studies (TWASs) (and similar applications) with typically small sample sizes for gene expression (or another molecular trait) data, providing a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach called R2S to select and remove invalid IVs (if any) to enhance robustness. We compared the performance of the proposed method with existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using the individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).
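The single-SNP idea that the abstract says it generalizes can be sketched as follows. This is only a minimal illustration, not the authors' R²-based multi-SNP method: it assumes a two-sample setting in which the SNP-exposure and SNP-outcome correlations come from independent samples, and it compares them with a Fisher z-test. The function name `steiger_direction` and its parameters are hypothetical.

```python
import math


def steiger_direction(r_gx, n_x, r_gy, n_y):
    """Compare |corr(SNP, exposure X)| with |corr(SNP, outcome Y)|.

    Simplified sketch: assumes the two correlations are estimated in
    independent samples of sizes n_x and n_y (both must exceed 3).
    Returns (direction, z, p) where p is a two-sided normal p-value.
    """
    # Fisher z-transform of the absolute correlations
    z_x = math.atanh(abs(r_gx))
    z_y = math.atanh(abs(r_gy))
    # Standard error of the difference for independent samples
    se = math.sqrt(1.0 / (n_x - 3) + 1.0 / (n_y - 3))
    z = (z_x - z_y) / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    if z > 0:
        direction = "X -> Y"  # SNP correlates more strongly with the exposure
    elif z < 0:
        direction = "Y -> X"  # SNP correlates more strongly with the outcome
    else:
        direction = "undetermined"
    return direction, z, p


if __name__ == "__main__":
    # Hypothetical inputs: small expression sample, large GWAS sample
    print(steiger_direction(r_gx=0.30, n_x=500, r_gy=0.10, n_y=100_000))
```

The intuition is that a valid instrument should explain more variance in the trait it affects directly than in a trait it affects only through that one. With a small gene expression sample, the standard error is dominated by the `1 / (n_x - 3)` term, which is consistent with the abstract's motivation for pooling information across multiple SNPs.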
Title: Inferring causal direction between two traits using R² with application to transcriptome-wide association studies. Journal: American journal of human genetics, pages 1782-1795. Publication date: 2024-08-08. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339628/pdf/
Pub Date: 2024-08-08 Epub Date: 2024-07-23 DOI: 10.1016/j.ajhg.2024.06.015
Jihoon G Yoon, Seong-Kyun Lim, Hoseok Seo, Seungbok Lee, Jaeso Cho, Soo Yeon Kim, Hyun Yong Koh, Annapurna H Poduri, Vijayalakshmi Ramakumaran, Pradeep Vasudevan, Martijn J de Groot, Jung Min Ko, Dohyun Han, Jong-Hee Chae, Chul-Hwan Lee
Histone deacetylase 3 (HDAC3) is a crucial epigenetic modulator essential for various developmental and physiological functions. Although its dysfunction is increasingly recognized in abnormal phenotypes, to our knowledge, there have been no established reports of human diseases directly linked to HDAC3 dysfunction. Using trio exome sequencing and extensive phenotypic analysis, we correlated heterozygous de novo variants in HDAC3 with a neurodevelopmental disorder having variable clinical presentations, frequently associated with intellectual disability, developmental delay, epilepsy, and musculoskeletal abnormalities. In a cohort of six individuals, we identified missense variants in HDAC3 (c.277G>A [p.Asp93Asn], c.328G>A [p.Ala110Thr], c.601C>T [p.Pro201Ser], c.797T>C [p.Leu266Ser], c.799G>A [p.Gly267Ser], and c.1075C>T [p.Arg359Cys]), all located in evolutionarily conserved sites and confirmed as de novo. Experimental studies identified defective deacetylation activity in the p.Asp93Asn, p.Pro201Ser, p.Leu266Ser, and p.Gly267Ser variants, positioned near the enzymatic pocket. In addition, proteomic analysis employing co-immunoprecipitation revealed that disrupted interactions with molecules involved in the CoREST and NCoR complexes, particularly in the p.Ala110Thr variant, constitute a central pathogenic mechanism. Moreover, immunofluorescence analysis showed a diminished nuclear-to-cytoplasmic fluorescence ratio in the p.Ala110Thr, p.Gly267Ser, and p.Arg359Cys variants, indicating impaired nuclear localization. Taken together, our study highlights that de novo missense variants in HDAC3 are associated with a broad spectrum of neurodevelopmental disorders, which emphasizes the complex role of HDAC3 in histone deacetylase activity, multi-protein complex interactions, and nuclear localization for proper physiological functions. These insights open new avenues for understanding the molecular mechanisms of HDAC3-related disorders and may inform future therapeutic strategies.
Title: De novo missense variants in HDAC3 leading to epigenetic machinery dysfunction are associated with a variable neurodevelopmental disorder. Journal: American journal of human genetics, pages 1588-1604. Publication date: 2024-08-08. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339613/pdf/