Pub Date: 2024-08-20 DOI: 10.1038/s10038-024-01281-2
Yoshiro Nagao
“Missing heritability” is a current problem in human genetics. I previously reported a method to estimate the heritability of a polymorphism (hp2) for a common disease without calculating the genetic variance under the dominant and recessive models. Here, I extend the method to the co-dominant model and carry out trial calculations of hp2. For comparison, I also calculate hp2 with the allele distribution model originally reported by Pawitan et al., taken as the conventional method. Unexpectedly, hp2 calculated with my method for rare variants with high odds ratios was much higher than the values calculated with the allele distribution model. While examining the basis for this difference, I noticed that conventional methods use the allele frequency (AF) of a variant in the general population to calculate the genetic variance of that variant. However, this implicitly assumes that the unaffected are included among the phenotypes of the disease, an assumption that is inconsistent with case-control studies, in which unaffected individuals belong to the control group. Therefore, I modified the allele distribution model to use the AF in the patient population. With this modification, the hp2 of rare variants was quite high. Recalculating hp2 for several rare variants reported in the literature with the modified allele distribution model yielded values 3.2 to 53.7 times higher than those calculated with the original allele distribution model. These results suggest that the contribution of rare variants to the heritability of a disease has been considerably underestimated.
Title: Contribution of rare variants to heritability of a disease is much greater than conventionally estimated: modification of allele distribution model. Journal of Human Genetics 69(12): 663-668.
Pub Date: 2024-08-19 DOI: 10.1038/s10038-024-01288-9
Qi Fang, Lanxi Ran, Xinying Bi, Jianyong Di, Ye Liu, Fengqin Xu, Binbin Wang
Male infertility is a widespread population health concern, causing various degrees of adverse fertility outcomes. We determined the genetic cause of infertility in a male from a consanguineous family, expanding the mutational spectrum of male infertility. A patient with non-obstructive azoospermia (NOA) was recruited, and the histological type of the patient's testicular tissue was categorized as maturation arrest. We identified a novel loss-of-function variant of syntaxin 2 (STX2) (c.142C>T:p.Gln48*) by performing whole-exome sequencing (WES) on the NOA patient from a consanguineous Chinese family. Sanger sequencing confirmed that the p.Gln48* variant was inherited from both parents. In silico analysis predicted the variant to be deleterious and to cause aberrant changes in the structure and function of STX2. In summary, we report for the first time a nonsense variant in an exon of STX2 in an infertile male presenting with NOA, a finding that is beneficial for the diagnosis and treatment of NOA.
Title: A novel homozygous nonsense variant of STX2 underlies non-obstructive azoospermia in a consanguineous Chinese family. Journal of Human Genetics 69(12): 675-677.
Primary ciliary dyskinesia (PCD) is a genetic disorder characterized by ciliary structural abnormalities and dysfunction, leading to chronic rhinosinusitis, otitis media with effusion, bronchiectasis, and infertility. Approximately half of Japanese PCD cases are attributed to variants in the dynein regulatory complex subunit 1 (DRC1) gene, predominantly homozygous deletions of exons 1–4 spanning 27,748 base pairs on chromosome 2. Here, we report 10 new PCD cases (9 families) in addition to 29 previously reported cases (24 families) caused by DRC1 variants. Among these 39 cases, biallelic DRC1 exon 1–4 deletions were detected in 38 (97.4%). These DRC1 deletions exhibited an identical breakpoint in all PCD cases in the Japanese and Korean populations, strongly suggesting a founder effect. In this study, we performed haplotype analysis using a whole-exome sequencing dataset of 18 Japanese PCD patients harboring large biallelic DRC1 deletions. We estimated that the founder allele likely emerged 115.1 generations ago (95% confidence interval: 33.7–205.1), suggesting an origin of approximately 3050 years ago, coinciding with the transition from the Jomon period to the early Yayoi period in Japan. Considering the formation of the modern Japanese population, the founder with the DRC1 exon 1–4 deletion likely lived on the Korean peninsula, with the allele later transmitted to Japan through migration. This study provides insights into the origin of the DRC1 copy number variant, the most frequent PCD variant in the Japanese and Korean populations, highlighting the importance of understanding population-specific genetic variations in the context of human migration and disease prevalence.
Title: A 3000-year-old founder variant in the DRC1 gene causes primary ciliary dyskinesia in Japan and Korea. Authors: Ryotaro Hashizume, Yifei Xu, Makoto Ikejiri, Shimpei Gotoh, Kazuhiko Takeuchi. Journal of Human Genetics 69(12): 655-661. Pub Date: 2024-08-16. DOI: 10.1038/s10038-024-01289-8.
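The dating in the DRC1 abstract above (115.1 generations, roughly 3050 years) can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope reading: the abstract does not state the generation time the authors used, so the per-generation value here is inferred from the two reported numbers, and the confidence interval in years is a simple rescaling under that assumption rather than a result reported by the authors.

```python
# Figures quoted in the abstract.
GENERATIONS = 115.1            # point estimate of the founder allele age
CI_GENERATIONS = (33.7, 205.1) # 95% confidence interval, in generations
YEARS = 3050                   # approximate age in years given in the abstract

# Assumption: a single constant generation time links the two figures.
# This value is inferred here; it is not stated in the abstract.
years_per_generation = YEARS / GENERATIONS

# Rescale the confidence interval to years under the same assumption.
ci_years = tuple(round(g * years_per_generation) for g in CI_GENERATIONS)

print(f"implied generation time: {years_per_generation:.1f} years")
print(f"95% CI in years (rescaled): {ci_years[0]} to {ci_years[1]}")
```

Under this reading the implied generation time is about 26.5 years, and the width of the rescaled interval shows how loosely the haplotype data alone constrain the date.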
Age at menarche (AAM) is a marker of female puberty. It is a heritable trait associated with various adult diseases. However, the genetic mechanism that determines AAM and links it to disease risk is poorly understood. Aiming to uncover the genetic basis of AAM, we conducted a joint association study in up to 438,089 women from 3 genome-wide association studies of European and East Asian ancestries. A series of bioinformatic analyses and causal inference were then performed to explore in-depth annotations at the associated loci and to infer the causal relationship between AAM and other complex traits and diseases. This largest meta-analysis identified a total of 21 novel AAM-associated loci at the genome-wide significance level (P < 5.0 × 10⁻⁸), 4 of which were European ancestry-specific. Functional annotations prioritized 33 candidate genes at the newly identified loci. Significant genetic correlations were observed between AAM and 67 complex traits. Further causal inference demonstrated effects of AAM on 13 traits, including forced vital capacity (FVC), high blood pressure, and age at first live birth, indicating that earlier AAM causes lower FVC, worse lung function, hypertension and earlier age at first (last) live birth. Enrichment analysis identified 5 enriched tissues: the hypothalamus middle, hypothalamo hypophyseal system, neurosecretory systems, hypothalamus and retina. Our findings may provide useful insights into the mechanisms determining AAM and the genetic interplay between AAM and some traits of women.
Title: The genetic architecture of age at menarche and its causal effects on other traits. Authors: Gui-Juan Feng, Qian Xu, Qi-Gang Zhao, Bai-Xue Han, Shan-Shan Yan, Jie Zhu, Yu-Fang Pei. Journal of Human Genetics 69(12): 645-653. Pub Date: 2024-08-15. DOI: 10.1038/s10038-024-01287-w.
In June 2024, the Japanese government introduced a new genomic strategic action to shorten the “diagnostic odyssey” for patients with rare and intractable diseases. Six groups of rare diseases have been newly recognized as “difficult-to-differentiate disorders”: (i) Muscle weakness group, (ii) Growth retardation, intellectual disability, and characteristic facial features group, (iii) Intellectual disability/epilepsy group, (iv) Cardiomyopathy group (mainly adult onset), (v) Proteinuria group, and (vi) Fever, inflammation, skin rash, osteoarthritis group. Comprehensive genomic testing can be reimbursed when a patient belongs to one of the six groups and certain requirements are met. The introduction of comprehensive genomic testing will improve the diagnosis rate of these diseases and has significant potential to enhance Japan’s rare and intractable disease policy. The new strategy in Japan and its rationale will be a reference for insurance reimbursement of comprehensive genomic testing in other countries that have universal health coverage supported by a public health insurance system.
Title: Japanese Public Health Insurance System’s new genomic strategic action to shorten the “diagnostic odyssey” for patients with rare and intractable diseases. Authors: Jiro Ezaki, Yukari Takahashi, Harutaka Saijo, Fuyuki Miya, Kenjiro Kosaki. Journal of Human Genetics 69(11): 549-552. Pub Date: 2024-08-15. DOI: 10.1038/s10038-024-01285-y.
Reciprocal chromosomal translocation is one type of genomic variation. When cytogenetically de novo reciprocal translocations are identified in patients with clinical manifestations, the genes at the breakpoints are considered to be related to the clinical features. In this study, we encountered a patient with severe developmental delay, intractable epilepsy, growth failure, distinctive features, and skeletal manifestations. Conventional karyotyping revealed a de novo translocation described as 46,XY,t(3;4)(q27;q31.2). Chromosomal microarray testing detected a 1.25-Mb microdeletion at 3q27.3q28. Although the skeletal manifestations may have been affected by this deletion, the neurological features of this patient were severe and could not be fully explained by it. Since no genomic copy number aberration was detected on chromosome 4, long-read whole-genome sequencing analysis was performed and a precise breakpoint was confirmed. A 460-bp deletion was detected between the two breakpoints; however, no gene was disrupted. FBXW7, the gene responsible for developmental delay, hypotonia, and impaired language, is in the 0.5-Mb telomeric region. Most of the patient’s clinical features were considered consistent with symptoms of FBXW7-related disorders, but were more severe. FBXW7 expression in the immortalized lymphoblasts of the patient was reduced compared to that in controls. Based on these findings, we suspect that FBXW7 is affected by downstream position effects of the chromosomal translocation. The severe neurological features of the patient may have been affected not only by the 3q27-q28 deletion but also by impaired expression of FBXW7 derived from the breakage of chromosome 4.
Title: Reciprocal chromosome translocation t(3;4)(q27;q31.2) with deletion of 3q27 and reduced FBXW7 expression in a patient with developmental delay, hypotonia, and seizures. Authors: Takeaki Tamura, Keiko Shimojima Yamamoto, Jun Tohyama, Ichiro Morioka, Hitoshi Kanno, Toshiyuki Yamamoto. Journal of Human Genetics 69(12): 639-644. Pub Date: 2024-08-09. DOI: 10.1038/s10038-024-01286-x.
Lissencephaly is a rare brain malformation characterized by abnormal neuronal migration during cortical development. In this study, we performed a comprehensive genetic analysis using next-generation sequencing in 12 unsolved Japanese lissencephaly patients, in whom PAFAH1B1, DCX, TUBA1A, and ARX variants had been excluded using the Sanger method. Exome sequencing (ES) was conducted on these 12 patients, identifying pathogenic variants in CEP85L, DYNC1H1, LAMC3, and DCX in four patients. Next, we performed genome sequencing (GS) on the eight unsolved patients, and structural variants in PAFAH1B1, including an inversion and microdeletions involving several exons, were detected in three patients. Notably, these microdeletions in PAFAH1B1 could not be detected by copy number variation (CNV) detection tools that apply depth-of-coverage methods to ES data. The density of repeat sequences, including Alu sequences or segmental duplications, which increase the susceptibility to structural variations, is very high in some lissencephaly spectrum genes (PAFAH1B1, TUBA1A, DYNC1H1). These CNVs were missed because of the limitations of ES-based CNV detection tools in repeat sequences. Our study suggests that a combined approach integrating ES with GS can contribute to a higher diagnostic yield and a better understanding of the genetic landscape of the lissencephaly spectrum.
Title: Exploring unsolved cases of lissencephaly spectrum: integrating exome and genome sequencing for higher diagnostic yield. Authors: Shogo Furukawa, Mitsuhiro Kato, Akihiko Ishiyama, Tomohiro Kumada, Takeshi Yoshida, Eri Takeshita, Pin Fee Chong, Hideo Yamanouchi, Yuko Kotake, Takayoshi Kyoda, Toshihiro Nomura, Yohane Miyata, Mitsuko Nakashima, Hirotomo Saitsu. Journal of Human Genetics 69(12): 629-637. Pub Date: 2024-08-09. DOI: 10.1038/s10038-024-01283-0.
Pub Date: 2024-08-02 DOI: 10.1038/s10038-024-01278-x
Kaho Tanaka, Kosuke Kato, Naoki Nonaka, Jun Seita
Human leukocyte antigen (HLA) genes are associated with a variety of diseases, yet the direct typing of HLA alleles is both time-consuming and costly. Consequently, various imputation methods leveraging sequential single nucleotide polymorphism (SNP) data have been proposed, employing either statistical or deep learning models, such as the convolutional neural network (CNN)-based model DEEP*HLA. However, these methods exhibit limited imputation efficiency for infrequent alleles and require a large reference dataset. In this context, we have developed a Transformer-based model for HLA allele imputation, named “HLA Reliable IMpuatioN by Transformer (HLARIMNT)”, designed to exploit the sequential nature of SNP data. We evaluated HLARIMNT’s performance using two distinct reference panels, the Pan-Asian reference panel (n = 530) and the Type 1 Diabetes genetics Consortium (T1DGC) reference panel (n = 5225), alongside a combined panel (n = 1060). HLARIMNT demonstrated superior accuracy to DEEP*HLA across several indices, particularly for infrequent alleles. Furthermore, we explored the impact of varying training data sizes on imputation accuracy, finding that HLARIMNT consistently outperformed DEEP*HLA across all data sizes. These findings suggest that Transformer-based models can efficiently impute not only HLA types but potentially other gene types from sequential SNP data.
Title: Efficient HLA imputation from sequential SNPs data by transformer. Journal of Human Genetics 69(10): 533-540. Open-access PDF: https://www.nature.com/articles/s10038-024-01278-x.pdf
Pub Date: 2024-07-31 DOI: 10.1038/s10038-024-01275-0
Yao-zhong Zhang, Seiya Imoto
Genomic sequences are traditionally represented as strings of characters: A (adenine), C (cytosine), G (guanine), and T (thymine). However, an alternative approach involves depicting sequence-related information through image representations, such as Chaos Game Representation (CGR) and read pileup images. With rapid advancements in deep learning (DL) methods within computer vision and natural language processing, there is growing interest in applying image-based DL methods to genomic sequence analysis. These methods involve encoding genomic information as images or integrating spatial information from images into the analytical process. In this review, we summarize three typical applications that use image processing with DL models for genome analysis. We examine the utilization and advantages of these image-based approaches.
Title: Genome analysis through image processing with deep learning models. Journal of Human Genetics 69(10): 519-525. Open-access PDF: https://www.nature.com/articles/s10038-024-01275-0.pdf
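The Chaos Game Representation (CGR) named in the review abstract above can be illustrated with a short sketch. This is a generic textbook-style version, not code from the review: the corner assignment (A bottom-left, C top-left, G top-right, T bottom-right) is one common convention assumed here, and the function names `cgr_points` and `cgr_image` are hypothetical. Each base moves the current point halfway toward its corner, and binning the points into a 2^k by 2^k grid gives a k-mer count image that an image-based model could take as input.

```python
# Assumed corner layout (a common convention; the abstract does not specify one).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA string as a list of (x, y) points."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # simplification: ambiguous bases such as N are skipped
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_image(seq, k=3):
    """Bin CGR points into a 2^k x 2^k grid of counts (one cell per k-mer)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        # Clamp guards against floating-point values rounding up to 1.0.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_image("ACGTTGCAACGT", k=2):
        print(row)
```

Because the cell reached after k steps depends only on the last k bases, the grid is effectively a k-mer frequency table laid out in two dimensions, which is what makes it usable as an image.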
Pub Date: 2024-07-31 DOI: 10.1038/s10038-024-01277-y
Kenji Naritomi
Title: An application supporting diagnosis for rare genetic diseases – UR-DBMS and Syndrome Finder –. Journal of Human Genetics 69(10): 527-531.