Taeho Jo, Paula Bice, Kwangsik Nho, Andrew J. Saykin, Alzheimer's Disease Sequencing Project
LD-informed deep learning for Alzheimer's gene loci detection using WGS data
Alzheimer's & Dementia: Translational Research & Clinical Interventions, volume 11, issue 1. Published 2025-01-16. DOI: 10.1002/trc2.70041
Full text: https://onlinelibrary.wiley.com/doi/10.1002/trc2.70041
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736638/pdf/
Citations: 0
Abstract
INTRODUCTION
The exponential growth of genomic datasets necessitates advanced analytical tools to identify genetic loci effectively from large-scale high-throughput sequencing data. This study presents Deep-Block, a multi-stage deep learning framework that incorporates biological knowledge into its AI architecture to identify genetic regions significantly associated with Alzheimer's disease (AD). The framework employs a three-stage approach: (1) genome segmentation based on linkage disequilibrium (LD) patterns, (2) selection of relevant LD blocks using sparse attention mechanisms, and (3) application of TabNet and Random Forest algorithms to quantify single nucleotide polymorphism (SNP) feature importance, thereby identifying genetic factors contributing to AD risk.
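Stage (1) above can be illustrated with a small sketch. The abstract does not state how Deep-Block defines its LD blocks, so everything here is an assumption made for illustration: LD is approximated by the squared Pearson correlation (r²) between adjacent SNP genotype vectors, and a new block starts whenever r² drops below a hypothetical threshold of 0.5. The function names `r_squared` and `segment_ld_blocks` are invented for this example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated here as no LD
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def segment_ld_blocks(genotypes, threshold=0.5):
    """Split position-ordered SNPs into contiguous blocks.

    genotypes: list of per-SNP genotype vectors, ordered by genomic position.
    Returns a list of (start, end) index pairs, both inclusive.
    The adjacent-pair rule and the threshold are illustrative assumptions.
    """
    if not genotypes:
        return []
    blocks, start = [], 0
    for i in range(1, len(genotypes)):
        # Close the current block when adjacent SNPs are only weakly correlated.
        if r_squared(genotypes[i - 1], genotypes[i]) < threshold:
            blocks.append((start, i - 1))
            start = i
    blocks.append((start, len(genotypes) - 1))
    return blocks


# Toy data: 5 SNPs genotyped in 6 individuals.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 2, 2],
    [2, 0, 1, 1, 0, 2],
    [2, 0, 1, 1, 0, 2],
]
blocks = segment_ld_blocks(snps, threshold=0.5)
print(blocks)  # [(0, 2), (3, 4)]
```

Comparing only adjacent SNPs keeps the sketch linear in the number of variants. Production tools typically use windowed or haplotype-based block definitions, which may give different boundaries.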
METHODS
Deep-Block was applied to a large-scale whole genome sequencing (WGS) dataset from the Alzheimer's Disease Sequencing Project (ADSP) comprising 7416 non-Hispanic white (NHW) participants: 3150 cognitively normal older adults (CN) and 4266 with AD.
RESULTS
A total of 30,218 LD blocks were identified and ranked by their relevance to Alzheimer's disease. Deep-Block then identified novel SNPs within the top 1500 LD blocks and confirmed previously known variants, including APOE rs429358 and rs769449. Expression quantitative trait loci (eQTL) analysis across 13 brain regions provided functional evidence for the identified variants. The results were cross-validated against established AD-associated loci from the European Alzheimer's and Dementia Biobank (EADB) and the GWAS Catalog.
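The ranking-and-selection step described above (score every LD block, keep the top-ranked ones) can be sketched as follows. The abstract does not specify which sparse attention mechanism Deep-Block uses or how block scores are produced. As an assumption, this example uses sparsemax, a common sparse alternative to softmax that assigns exactly zero weight to low-scoring items. The raw scores, the function names, and the cutoff of 2 (standing in for the paper's 1500) are hypothetical.

```python
def sparsemax(scores):
    """Project raw scores onto the probability simplex, yielding a sparse distribution."""
    z_sorted = sorted(scores, reverse=True)
    cumsum, tau = 0.0, 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # The support is the longest prefix for which this condition holds.
        if 1 + k * z_k > cumsum:
            tau = (cumsum - 1) / k
    # Items at or below the threshold tau receive exactly zero weight.
    return [max(s - tau, 0.0) for s in scores]


def top_blocks(weights, k):
    """Return the indices of the k highest-weighted blocks, best first."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


# Hypothetical relevance scores for four LD blocks.
block_scores = [2.0, 1.5, 0.1, -1.0]
weights = sparsemax(block_scores)
selected = top_blocks(weights, 2)
print(weights)   # [0.75, 0.25, 0.0, 0.0]
print(selected)  # [0, 1]
```

Unlike softmax, the output contains exact zeros, so uninformative blocks drop out of the ranking entirely instead of keeping a small residual weight.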
DISCUSSION
The Deep-Block framework effectively processes large-scale high-throughput sequencing data while preserving SNP interactions during dimensionality reduction, minimizing bias and information loss. The framework's findings are supported by tissue-specific eQTL evidence across brain regions, indicating the functional relevance of the identified variants. Additionally, the Deep-Block approach identified both known and novel genetic variants, enhancing our understanding of the genetic architecture of AD and demonstrating the framework's potential for application in large-scale sequencing studies.
Highlights
Growing genomic datasets require advanced tools to identify genetic loci in sequencing data.
Deep-Block, a novel AI framework, was used to process large-scale ADSP WGS data.
Deep-Block identified both known and novel AD-associated genetic loci.
rs429358 (APOE) was the key variant; rs11556505 (TOMM40) and rs34342646 (NECTIN2) were also significant.
The AI framework uses biological knowledge to enhance detection of Alzheimer's loci.
About the journal:
Alzheimer's & Dementia: Translational Research & Clinical Interventions (TRCI) is a peer-reviewed, open access journal from the Alzheimer's Association®. The journal seeks to bridge the full scope of explorations between basic research on drug discovery and clinical studies, validating putative therapies for aging-related chronic brain conditions that affect cognition, motor functions, and other behavioral or clinical symptoms associated with all forms of dementia and Alzheimer's disease. The journal will publish findings from diverse domains of research and disciplines to accelerate the conversion of abstract facts into practical knowledge: specifically, to translate what is learned at the bench into bedside applications. The journal seeks to publish articles that go beyond a singular emphasis on either basic drug discovery research or clinical research. Rather, an important theme of articles will be the linkages between and among the various discrete steps in the complex continuum of therapy development. For rapid communication among a multidisciplinary research audience involving the range of therapeutic interventions, TRCI will consider only original contributions that include feature-length research articles, systematic reviews, meta-analyses, brief reports, narrative reviews, commentaries, letters, perspectives, and research news that would advance a wide range of interventions to ameliorate symptoms or alter the progression of chronic neurocognitive disorders such as dementia and Alzheimer's disease. The journal will publish on topics related to medicine, geriatrics, neuroscience, neurophysiology, neurology, psychiatry, clinical psychology, bioinformatics, pharmacogenetics, regulatory issues, health economics, pharmacoeconomics, and public health policy as these apply to preclinical and clinical research on therapeutics.