Pub Date: 2026-02-07 DOI: 10.1016/j.xhgg.2026.100579
Zheng Li, Wei Zhao, Xiang Zhou, Yuk Yee Leung, Gerard D Schellenberg, Li-San Wang, Sebastian Schönherr, Lukas Forer, Christian Fuchsberger, Sharmistha Dey, Jinkook Lee, Jennifer A Smith, Aparajit B Dey, Sharon L R Kardia
India is the most populous country globally, yet genetic studies involving Indian individuals remain limited. The Indian population is composed of many founder groups and has a mixed genetic ancestry, including an ancestral component not observed anywhere outside of India. This presents a unique opportunity to uncover novel disease variants and develop tailored medical interventions. To facilitate genetic research in India, a crucial first step is to create a foundational resource that serves as a benchmark for future population studies and methods development. Thus, we constructed the largest and most nationally representative linkage disequilibrium (LD) and genotype imputation reference panels in India to date, using high-coverage whole-genome sequencing data of 2,680 participants from the Longitudinal Aging Study in India-Harmonized Diagnostic Assessment of Dementia (LASI-DAD). As an LD reference panel, LASI-DAD includes 69.5 million variants, representing 170% and 213% increases relative to the 1000 Genomes Project (1000G) and TOP-LD South Asian panels, respectively. Besides serving as an LD lookup panel, LASI-DAD facilitates various statistical analyses relying on precise LD estimates. In polygenic risk score (PRS) analyses, LASI-DAD improved the PRS predictive performance by 2.1% to 35.1% across traits and studies. As an imputation reference panel, LASI-DAD enhanced imputation accuracy, measured by the Pearson correlation between imputed and true genotypes, by 3% to 101% (mean = 38%) compared to the TOPMed panel and by 3% to 73% (mean = 27%) compared to the Genome Asia Pilot panel across different allele frequencies. The LASI-DAD reference panel is publicly available to benefit future studies.
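The imputation accuracy metric named above (Pearson correlation between imputed and true genotypes) can be sketched in a few lines. This is an illustrative assumption, not the authors' pipeline: the function names `pearson_r` and `relative_gain`, and the toy genotype and dosage vectors, are hypothetical. The sketch scores one variant under two panels and reports the percentage gain of one over the other.

```python
import math

def pearson_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes (0/1/2) and imputed dosages."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_t * var_d)

def relative_gain(new_value, baseline_value):
    """Percentage improvement of new_value over baseline_value."""
    return 100.0 * (new_value - baseline_value) / baseline_value

# Toy data (hypothetical): six samples at a single variant.
truth = [0, 1, 2, 1, 0, 2]
panel_a = [0.1, 0.9, 1.8, 1.2, 0.2, 1.9]  # dosages from a well-matched panel
panel_b = [0.4, 0.7, 1.1, 1.3, 0.6, 1.2]  # dosages from a less well-matched panel

r_a = pearson_r(truth, panel_a)
r_b = pearson_r(truth, panel_b)
print(f"r(panel A) = {r_a:.3f}, r(panel B) = {r_b:.3f}")
print(f"relative gain of A over B = {relative_gain(r_a, r_b):.1f}%")
```

In practice the correlation would be averaged over many variants within each allele-frequency bin, which is presumably how a range such as 3% to 101% arises across frequencies.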
A reference panel for linkage disequilibrium and genotype imputation using whole-genome sequencing data from 2,680 participants across India. HGG Advances, 100579.
Approximately 200 genes have been identified as causative in hereditary hearing loss. Genetic testing is increasingly important, not only for accurate diagnosis but also for predicting audiometric profiles, prognoses, and potential syndromic features. Hereditary hearing loss can be syndromic or nonsyndromic, with nonsyndromic forms further classified by inheritance as autosomal dominant or autosomal recessive. In autosomal dominant nonsyndromic hearing loss (ADNSHL), three pathological mechanisms (haploinsufficiency, dominant-negative effects, and gain of function) are often implicated. Moreover, specific genes correlate with distinct audiometric patterns: WFS1 variants typically cause low-frequency hearing loss, whereas KCNQ4 and POU4F3 variants are linked to high-frequency loss. To investigate the mechanisms underlying these frequency-dependent patterns, gene expression across cochlear turns has been compared in mice, but interpretation of the results was limited by inherent structural differences between rodent and primate cochleae. Therefore, the common marmoset (Callithrix jacchus), whose cochlea is anatomically and functionally closer to the human cochlea, was used here as an improved model. Using RNA sequencing (RNA-seq) across cochlear turns of common marmosets, the present study aimed to uncover gene expression and alternative splicing patterns that may explain tonotopic manifestations in hereditary hearing loss, including those caused by WFS1 variants. The study draws on common marmoset cochlear RNA-seq data, and its findings are valuable for genetic diagnosis and the development of gene therapies.
Pub Date: 2026-02-06 DOI: 10.1016/j.xhgg.2026.100578
Shu Yokota, Hidekane Yoshimura, Shin-Ya Nishio, Erika Sasaki, Keisuke Mukasa, Shin-Ichi Usami, Yutaka Takumi
Identification of Alternative Splicing in WFS1 Associated with Low-Frequency Hearing Loss in Common Marmoset. HGG Advances, 100578.
Pub Date: 2026-02-04 DOI: 10.1016/j.xhgg.2026.100576
Amy L Williams
Genetic relatives share long stretches of DNA they co-inherited from a common ancestor in identical-by-descent (IBD) segments. Because children inherit half their parents' genomes, the expected amount of DNA relatives share halves with each generation that separates them, being 2^(-d) for d-degree relatives. Even so, there is substantial variance in sharing rates, such that most distant relatives share zero IBD segments. We characterized IBD segment sharing between relatives by simulating 100,000 pairs for each of first through eighth cousins, including once removed and half-cousins, while modeling both crossover interference and sex-specific genetic maps. Our results show that 98.5% of third cousins share at least one IBD segment, while only 32.7% of fifth cousins and 0.96% of eighth cousins have such sharing. These sharing rates are slightly higher than those from models that ignore these more elaborate crossover features, and they can be filtered by segment length. The resulting segment count distributions are available with an interactive segment length threshold at https://hapi-dna.org/ibd-sharing-rates/.
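The halving rule can be made concrete with a short sketch. It assumes the standard pedigree convention that full k-th cousins are (2k + 1)-degree relatives, and that each generation of removal or half-cousin status adds one degree. The function names `cousin_degree` and `expected_sharing` are hypothetical, and the sketch gives only the expectation, not the simulated distributions reported in the abstract.

```python
def cousin_degree(k, removed=0, half=False):
    """Degree of relatedness d for k-th cousins.

    Full k-th cousins are separated by 2 * (k + 1) meioses through each of
    two shared ancestors, which gives d = 2k + 1. Each generation of removal
    adds one degree, and so does sharing only one ancestor (half-cousins).
    """
    if k < 1 or removed < 0:
        raise ValueError("k must be >= 1 and removed must be >= 0")
    return 2 * k + 1 + removed + (1 if half else 0)

def expected_sharing(degree):
    """Expected fraction of DNA shared by d-degree relatives: 2 ** (-d)."""
    return 2.0 ** (-degree)

# Expected sharing for first through eighth full cousins.
for k in range(1, 9):
    d = cousin_degree(k)
    print(f"cousin level {k}: degree {d}, expected sharing {expected_sharing(d):.6%}")
```

The expectation stays positive at every degree, whereas the simulated results show that most distant pairs share no segment at all. That gap is the variance the abstract describes.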
The rate of identical-by-descent segment sharing between close and distant relatives. HGG Advances, 100576.
Pub Date: 2026-02-03 DOI: 10.1016/j.xhgg.2026.100577
Ella Beraldo, Shelin Adam, Colleen Guimond, Jan M Friedman
An increased frequency of sporadic autosomal dominant disorders has been observed among children born to older fathers. This paternal age effect is thought to reflect an accumulation of new mutations in the male germ line as DNA replication and cell division continue to occur as men age. Genome-wide sequencing is useful for identifying disease-causing genetic variants in patients with suspected genetic diseases and for determining inheritance or de novo mutation of the variants when done in patient-parent trios. We analyzed paternal ages in 593 families who received trio or quad exome or genome sequencing for suspected genetic disease. The mean age of fathers of children with de novo disease-causing variants (35.09 years) was significantly greater than that of children with inherited disease-causing variants (33.78 years, p=0.04). The mean age of mothers of children with de novo disease-causing variants (31.86 years) was not significantly greater than that of children with inherited disease-causing variants (30.80 years, p=0.09). Interestingly, when the de novo disease-causing variants were broken down into subgroups by variant type, both mean paternal age and mean maternal age of children with de novo indel variants (paternal = 36.33 years, maternal = 33.34 years) were significantly higher than in children identified to have de novo single nucleotide variants (paternal = 34.35 years, p=0.03; maternal = 31.15 years, p=0.004). This observation, which may have implications for how indels arise, requires further study.
Paternal age effect in autosomal dominant or X-linked de novo variants identified by genome-wide sequencing. HGG Advances, 100577.
Pub Date: 2026-01-20 DOI: 10.1016/j.xhgg.2026.100575
Philip Harraka, Robert L O'Reilly, Jared Burke, Paul Yeh, Kerryn Howlett, Kiarash Behrouzfar, Daniele Belluoccio, Amanda Rewse, Brigid M Lynch, Kristen J Bubb, Stephen J Nicholls, Roger L Milne, Melissa C Southey
Clonal haematopoiesis of indeterminate potential (CHIP) is associated with many diseases of ageing. Large research initiatives are needed to develop clinical guidelines for the management of individuals with CHIP, and their risk of disease. However, little guidance is available for the classification of variants as CHIP-associated, or how to identify individuals consistently and systematically as having CHIP. This study aimed to develop and execute a resource-mindful framework for identifying individuals with CHIP, and those without, for downstream clinical studies. This framework was used to categorise CHIP in a cross-section of 2,328 participants from the Australian Breakthrough Cancer (ABC) Study. DNA extracted from saliva samples was sequenced for a panel of ten gene regions that frequently carry variants that are associated with CHIP. Variants in these regions were curated for CHIP according to field-specific criteria. Individuals were categorised as either CHIP-positive, -negative, or -indeterminate based on their variant findings. Sequencing was successfully performed on 2,328 individuals. The mean age (± standard deviation) was 68±3 years and 48% were men. 347 participants (15%) were identified as CHIP-positive with a total of 400 CHIP-associated variants. 1,442 participants (62%) were considered CHIP-negative based on finding no somatic variation within the target regions. The remaining 539 (23%) were considered CHIP-indeterminate because they had at least one variant that could not be interpreted. This framework provides a consistent approach to the categorisation of individuals as CHIP-positive or -negative for clinical research and provides an opportunity for improved harmonisation in the curation of CHIP.
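The three-way categorisation can be sketched as a simple decision rule. The labels `"chip"` and `"uninterpretable"` and the function name `categorise_participant` are hypothetical, and the precedence used here (a CHIP-associated variant outranks an uninterpretable one) is an assumption that the abstract does not state. The counts at the end are the ones reported above.

```python
def categorise_participant(variant_labels):
    """Assign a CHIP category from curated variant labels for one participant.

    variant_labels holds one label per somatic variant found in the target
    regions. Only two hypothetical labels are modelled: "chip" (curated as
    CHIP-associated) and "uninterpretable" (could not be interpreted).
    """
    if "chip" in variant_labels:
        return "CHIP-positive"       # assumed to take precedence
    if "uninterpretable" in variant_labels:
        return "CHIP-indeterminate"
    return "CHIP-negative"           # no somatic variation in the target regions

# Counts reported in the abstract.
reported_counts = {
    "CHIP-positive": 347,
    "CHIP-negative": 1442,
    "CHIP-indeterminate": 539,
}
total = sum(reported_counts.values())
for category, count in reported_counts.items():
    print(f"{category}: {count} of {total} ({100 * count / total:.0f}%)")
```

The three reported counts sum to the 2,328 sequenced participants, so every individual falls into exactly one category.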
A framework for the molecular identification of CHIP for clinical research. HGG Advances, 100575.
Pub Date: 2026-01-20 DOI: 10.1016/j.xhgg.2026.100574
Dongyu Wang, Sabrina Abbruzzese, Nancy Heard-Costa, Andy Rampersaud, Eden Martin, Adam Naj, Bilcag Akgun, Brian Kunkle, Sudha Seshadri, Gina Peloso, Anita L DeStefano, Zilin Li, Xihao Li, Seung Hoan Choi
Rare genetic variation is considered a potential source of heritability in individuals with sporadic Alzheimer's disease and related dementias (ADRD). The STAAR framework leverages multiple functional annotations of genetic variants and combines association statistics from multiple variant aggregation-based methods, including burden, SKAT, and ACAT-V, into a single measure of significance. Using whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP), we comprehensively examined the association of rare genetic variation with ADRD in 23,454 individuals (37% ADRD cases) and with cognitively healthy elder status in 13,292 individuals (13% cognitively healthy elders) from diverse populations via the STAAR framework. We identified several genes significantly associated with ADRD or cognitively healthy status. However, our analysis revealed several limitations of the STAAR framework when it incorporates ultra-rare variants with dichotomous outcomes. To enhance the robustness of the framework, we proposed several computational refinements, including creating a burden of ultra-rare variants and employing more precise annotations that match the expected mechanism. After implementing the proposed modifications, the association with ADRD for ZNF200 was no longer statistically significant (α = 1 × 10^-7), while TBX19, PLXNB2, CARD11, and LINC01880 remained significantly associated with cognitively healthy status. We identified and addressed the computational limitations in the STAAR framework that could lead to potentially spurious results for ultra-rare variant aggregates with an extremely low cumulative minor allele count. Our proposed refinements produced more robust results for associations with rare variants in the context of dichotomous outcomes.
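The abstract does not say how the burden, SKAT, and ACAT-V results are merged into a single measure of significance. Our understanding, stated here as an assumption, is that the omnibus step relies on the Cauchy combination (ACAT) rule, which the sketch below implements for a list of p-values. The function name `cauchy_combination` and the example p-values are hypothetical. This is not the STAAR package code.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) rule.

    T = sum_i w_i * tan((0.5 - p_i) * pi) / sum_i w_i
    p = 0.5 - atan(T) / pi
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    total_weight = sum(weights)
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values)) / total_weight
    # Note: for extremely small p-values this subtraction loses precision;
    # production code typically switches to an approximation in that regime.
    return 0.5 - math.atan(statistic) / math.pi

# Hypothetical p-values from three aggregation tests on one gene.
combined = cauchy_combination([0.002, 0.04, 0.3])
print(f"combined p-value: {combined:.4g}")
```

A combination rule of this kind is dominated by the smallest input p-value, so a single unstable test on an aggregate with very few minor alleles can drive the combined result. That is consistent with the concern about spurious findings raised in the abstract.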
Application of the STAAR Framework in Detecting Rare Variant Associations with Alzheimer's Disease and Related Dementias: Insights and Implications. HGG Advances, 100574.
Pub Date: 2026-01-16 DOI: 10.1016/j.xhgg.2026.100573
Ang Zhou, Ville Karhunen, Haodong Tian, Janne Pott, Ashish Patel, Eric A W Slob, Stephen Burgess
Optimal selection of instrumental variables (IVs) from a single gene region in cis-Mendelian randomization (MR) is challenging, as variants are highly correlated due to linkage disequilibrium (LD). Using only the lead variant is convenient but may not achieve full statistical power if multiple signals exist. We compared four selection methods that incorporate correlated non-lead variants, including LD-pruning, conditional and joint analysis (COJO), sum of single effects (SuSiE) regression, and principal component analysis (PCA), and evaluated their ability to increase instrument strength, measured by variance explained in the exposure (R^2), relative to the lead-variant-only approach. We applied these methods to circulating haptoglobin (HP), to simulated traits with known variance explained, and to 15 additional gene regions where non-lead cis-protein quantitative trait loci (pQTLs) contributed varying proportions of cis-genetic variance. R^2 was estimated from variant-protein association estimates (Fenland study, n = 10,708) using LD from the UK Biobank (n = 356,557). In the HP region, the four methods produced a median proportional gain in R^2 of 145.1% compared with the lead variant alone (range: 69.6%-169.4%), with a median reduction in the MR standard error of 36.3% (range: -37.9% to -19.3%). In simulations, all methods were able to recover the expected genetic variance. Across the 15 gene regions, methods incorporating non-lead variants consistently outperformed the lead-variant-only approach. Variant selection methods incorporating correlated non-lead variants can reliably improve instrument strength in cis-MR analyses. We recommend using such methods but advise comparing their estimates with the lead-variant-only estimate to safeguard against numerical instability.
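How a correlated non-lead variant can add to the variance explained is easiest to see with two variants. The sketch below assumes standardized genotypes and exposure, marginal (one-variant-at-a-time) effect estimates b1 and b2, and an LD correlation r between the variants, so that the joint variance explained is b' R^(-1) b. The function names and toy numbers are hypothetical, and this is not the estimator used in the study.

```python
def r2_lead_only(b_lead):
    """Variance explained by one variant with standardized marginal effect b."""
    return b_lead ** 2

def r2_two_variants(b1, b2, r):
    """Joint variance explained by two correlated variants.

    Closed form of b' R^(-1) b for a 2 x 2 LD matrix with off-diagonal r:
    (b1^2 + b2^2 - 2 * r * b1 * b2) / (1 - r^2).
    """
    if abs(r) >= 1.0:
        raise ValueError("variants in perfect LD carry no separate information")
    return (b1 * b1 + b2 * b2 - 2.0 * r * b1 * b2) / (1.0 - r * r)

def proportional_gain(r2_joint, r2_lead):
    """Percentage gain in variance explained relative to the lead variant."""
    return 100.0 * (r2_joint - r2_lead) / r2_lead

# Toy numbers (hypothetical): a lead variant and a second, partly correlated signal.
lead = r2_lead_only(0.30)
joint = r2_two_variants(0.30, 0.25, 0.4)
print(f"lead-only R^2 = {lead:.4f}, joint R^2 = {joint:.4f}")
print(f"proportional gain = {proportional_gain(joint, lead):.1f}%")
```

If the second variant merely tags the lead (b2 = r * b1), the joint value collapses to b1^2 and the gain is zero. As r approaches 1 the denominator approaches 0, which illustrates the numerical instability the authors advise checking against.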
Variant selection to maximize variance explained in cis-Mendelian randomization. HGG Advances, 100573.
Pub Date: 2026-01-15 Epub Date: 2025-08-27 DOI: 10.1016/j.xhgg.2025.100498
Jarosław Dulski, Arun K Boddapati, Barbara Risi, Pablo Iruzubieta, Antonio Orlacchio, Roberto Fernández-Torrón, Tamara Castillo-Triviño, Adolfo López de Munain, Steve Vucic, Alessandro Padovani, Laura Donker Kaat, Tahsin Stefan Barakat, Leonard Petrucelli, Mercedes Prudencio, John E Landers, Jochen H Weishaupt, Andreas Prokop, Massimiliano Filosto, Zbigniew K Wszolek, Devesh C Pant
KIF5A (kinesin family member 5A) is a motor protein that functions as a key component of the axonal transport machinery. Variants in KIF5A are linked to several neurodegenerative diseases, mainly spastic paraplegia type 10 (SPG10), Charcot-Marie-Tooth disease type 2 (CMT2), and amyotrophic lateral sclerosis (ALS). These diseases share motor neuron involvement but vary significantly in clinical presentation, severity, and progression. KIF5A variants are mainly categorized into N-terminal variants associated with SPG10/CMT2 and C-terminal variants linked to ALS. This study used a multiplex NULISA targeted platform to analyze the plasma proteome of individuals with KIF5A-linked SPG10 and ALS and to compare it with that of healthy controls. Our results revealed distinct proteomic signatures, with significant alterations in proteins related to synaptic function and inflammation. Notably, neurofilament light polypeptide, a biomarker for neurodegenerative diseases, was elevated in individuals with KIF5A-linked ALS but not in those with SPG10. Moreover, these findings can now be used to gain a mechanistic understanding of axonopathies linked to N- versus C-terminal KIF5A variants, which affect both the central and peripheral nervous systems.
{"title":"Targeted plasma proteomics uncover proteins associated with KIF5A-linked SPG10 and ALS spectrum disorders.","authors":"Jarosław Dulski, Arun K Boddapati, Barbara Risi, Pablo Iruzubieta, Antonio Orlacchio, Roberto Fernández-Torrón, Tamara Castillo-Triviño, Adolfo López de Munain, Steve Vucic, Alessandro Padovani, Laura Donker Kaat, Tahsin Stefan Barakat, Leonard Petrucelli, Mercedes Prudencio, John E Landers, Jochen H Weishaupt, Andreas Prokop, Massimiliano Filosto, Zbigniew K Wszolek, Devesh C Pant","doi":"10.1016/j.xhgg.2025.100498","DOIUrl":"10.1016/j.xhgg.2025.100498","url":null,"abstract":"<p><p>KIF5A (Kinesin family member 5A) is a motor protein that functions as a key component of the axonal transport machinery. Variants in KIF5A are linked to several neurodegenerative diseases, mainly spastic paraplegia type 10 (SPG10), Charcot-Marie-Tooth disease type 2 (CMT2), and amyotrophic lateral sclerosis (ALS). These diseases share motor neuron involvement but vary significantly in clinical presentation, severity, and progression. KIF5A variants are mainly categorized into N-terminal variants associated with SPG10/CMT2 and C-terminal variants linked to ALS. This study utilized a multiplex NULISA targeted platform to analyze plasma proteome from KIF5A-linked SPG10 and ALS individuals and compare them to healthy controls. Our results revealed distinct proteomic signatures, with significant alterations in proteins related to synaptic function and inflammation. Notably, neurofilament light polypeptide, a biomarker for neurodegenerative diseases, was elevated in KIF5A ALS but not in SPG10 individuals. 
Moreover, these findings can now be used to gain mechanistic understanding of axonopathies linking to N- versus C-terminal KIF5A variants affecting both central and peripheral nervous systems.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100498"},"PeriodicalIF":3.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144972066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assembling individual genomes remains an expensive endeavor, hindering large-scale comparative human genomics. So far, chromosome-level assemblies of only a few individuals, including PR1 (Puerto Rican), Ash1 (Ashkenazi Jew), Han1 (Southern Han Chinese), and CHM13 (Northern European), have been reported. Here, we present a chromosome-level genome assembly of a non-International Genome Sample Resource (IGSR) and non-Genome in a Bottle individual from the Indian subcontinent (KIn1) obtained using a cost-effective approach. We achieved an N50 of 141 Mb and an L50 of 9, very close to the maximum achievable N50 of 147 Mb and minimum achievable L50 of 8, respectively, for human genomes. We also generated chromosome-level assemblies for other individuals from the Indian diaspora, including PJL1 from Punjab, Lahore (HG03492), GIH1 from Gujarat (NA20847), BIB1 from Bangladesh (HG03009), and ITU1 from Andhra Pradesh (HG04217), all represented in IGSR, by scaffolding their publicly available contigs using the corresponding Hi-C data. By comparing these individual genomes with those reported elsewhere, we demonstrate that the configuration of inversion 8p23.1 in KIn1, Han1, GIH1, and BIB1 is similar to that in hg38, referred to here as 8p23.1std. The inverted configuration, 8p23.1inv, is present in CHM13, PJL1, Ash1, and PR1. We also find evidence of all three large known inversions in the p-arm of chromosome 16, with prevalence among South Asians. In chromosome 5, one of the reported inversions is present in all assemblies except hg38 and Ash1. Finally, we investigate the large inversions that are unique to KIn1.
{"title":"The Karnataka Individual Genome Project expands the human reference landscape to include South Asia.","authors":"Apoorva Ganesh, Anisha Mhatre, Yash Chindarkar, Moushmi Goswami, Prakruti Mishra, Aditya Sharma, Manjushri Kalpande, Febina Ravindran, Subhashini Srinivasan, Bibha Choudhary","doi":"10.1016/j.xhgg.2025.100516","DOIUrl":"10.1016/j.xhgg.2025.100516","url":null,"abstract":"<p><p>Assembling individual genomes remains an expensive endeavor, hindering large-scale comparative human genomics. So far, chromosome-level assemblies of only a few individuals, including PR1 (Puerto Rican), Ash1 (Ashkenazi Jew), Han1 (Southern Han Chinese), and CHM13 (Northern European) have been reported. Here, we present a chromosome-level genome assembly of a non-International Genome Sample Resource (IGSR) and non-Genome in a Bottle individual from the Indian subcontinent (KIn1) obtained using a cost-effective approach. We achieved an N50 of 141 Mb and an L50 of 9-very close to the maximum achievable N50 of 147 Mb and minimum achievable L50 of 8, respectively, for human genomes. We also generated chromosome-level assemblies for other individuals from the Indian diaspora, including PJL1 from Punjab, Lahore (HG03492), GIH1 from Gujarat (NA20847), BIB1 from Bangladesh (HG03009), and ITU1 from Andhra Pradesh (HG04217), all represented in IGSR, by scaffolding the publicly available respective contigs and Hi-C data. Here, we demonstrate that by comparing these individual genomes with those reported elsewhere, the configuration of inversion 8p23.1 in KIn1, Han1, GIH1, and BIB1 is similar to that in hg38, here to referred as 8p23.1std. The inverted configuration, 8p23.1inv, is present in CHM13, PJL1, Ash1, and PR1. We also find evidence of all three large known inversions in the p-arm of chromosome 16, with prevalence among South Asians. In chromosome 5, one of the reported inversions is present in all assemblies except hg38 and Ash1. 
Finally, we investigate the large inversions that are unique to KIn1.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100516"},"PeriodicalIF":3.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12513208/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145087377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-15 Epub Date: 2025-10-16 DOI: 10.1016/j.xhgg.2025.100533
Yining Liu, Yeunjoo E Song, Audrey Lynn, Weihuan Wang, Kristy Miskimen, Sarada L Fuzzell, Sherri D Hochstetler, Renee A Laux, Laura J Caywood, Jason E Clouse, Sharlene D Herington, Ping Wang, Alexander Gulyayev, Daniel A Dorfsman, Noel C Moore, Leighanne R Main, Michael B Prough, Andrew F Zaman, Larry D Adams, Patrice Whitehead, Paula Ogrocki, Alan J Lerner, Jeffery M Vance, Michael L Cuccaro, William K Scott, Margaret A Pericak-Vance, Jonathan L Haines
Telomere length (TL) is a key indicator of biological aging. Understanding the association between TL and cognitive impairment may provide important insights into disease mechanisms for age-related neurodegenerative disorders, such as Alzheimer's disease (AD). However, the relationship between TL and cognitive function remains controversial, with studies reporting positive, negative, or no associations. This inconsistency may be attributed to genetic and environmental variations or differences in TL measurement methods. We conducted a comprehensive characterization of DNA sequence-determined TL and analyzed its association with cognitive function in the Midwestern Amish. The Midwestern Amish are a founder population demonstrating reduced genetic and environmental variation compared with the general European population. This unique population structure allowed us to better control for potential confounding by non-telomere genetic and environmental factors. Our study confirmed the expected telomere shortening with age and provided both SNP-based and pedigree-based TL heritability estimates. No significant correlation was observed between TL and cognitive function. However, a genome-wide association study of TL revealed three loci associated with TL, each containing Amish-enriched rare variants.
{"title":"Telomere length, aging, and cognitive function in the Midwestern Amish.","authors":"Yining Liu, Yeunjoo E Song, Audrey Lynn, Weihuan Wang, Kristy Miskimen, Sarada L Fuzzell, Sherri D Hochstetler, Renee A Laux, Laura J Caywood, Jason E Clouse, Sharlene D Herington, Ping Wang, Alexander Gulyayev, Daniel A Dorfsman, Noel C Moore, Leighanne R Main, Michael B Prough, Andrew F Zaman, Larry D Adams, Patrice Whitehead, Paula Ogrocki, Alan J Lerner, Jeffery M Vance, Michael L Cuccaro, William K Scott, Margaret A Pericak-Vance, Jonathan L Haines","doi":"10.1016/j.xhgg.2025.100533","DOIUrl":"10.1016/j.xhgg.2025.100533","url":null,"abstract":"<p><p>Telomere length (TL) is a key indicator of biological aging. Understanding the association between TL and cognitive impairment may provide important insights into disease mechanisms for age-related neurodegenerative disorders, such as Alzheimer's disease (AD). However, the relationship between TL and cognitive function remains controversial, with studies reporting positive, negative, or no associations between them. This inconsistency may be attributed to genetic and environmental variations or differences in TL measurement methods. We conducted a comprehensive characterization of DNA sequence-determined TL and analyzed its association with cognitive function in the Midwestern Amish. The Midwestern Amish are a founder population demonstrating reduced genetic and environmental variation compared with the general European population. This unique population structure allowed us to better control for potential confounding by non-telomere genetic and environmental factors. Our study confirmed the expected telomere shortening with age and provided both SNP-based and pedigree-based TL heritability estimates. No significant correlation was observed between TL and cognitive function. 
However, a genome-wide association study of TL revealed three loci associated with TL, each containing Amish-enriched rare variants.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100533"},"PeriodicalIF":3.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12639303/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145313856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}