Wen Wen, Jiaxin Zhong, Zhaoxi Zhang, Lijuan Jia, Tinyi Chu, Nating Wang, Charles G Danko, Zhong Wang
{"title":"dHICA: a deep transformer-based model enables accurate histone imputation from chromatin accessibility.","authors":"Wen Wen, Jiaxin Zhong, Zhaoxi Zhang, Lijuan Jia, Tinyi Chu, Nating Wang, Charles G Danko, Zhong Wang","doi":"10.1093/bib/bbae459","DOIUrl":null,"url":null,"abstract":"<p><p>Histone modifications (HMs) are pivotal in various biological processes, including transcription, replication, and DNA repair, significantly impacting chromatin structure. These modifications underpin the molecular mechanisms of cell-type-specific gene expression and complex diseases. However, annotating HMs across different cell types solely using experimental approaches is impractical due to cost and time constraints. Herein, we present dHICA (deep histone imputation using chromatin accessibility), a novel deep learning framework that integrates DNA sequences and chromatin accessibility data to predict multiple HM tracks. Employing the transformer architecture alongside dilated convolutions, dHICA boasts an extensive receptive field and captures more cell-type-specific information. dHICA outperforms state-of-the-art baselines and achieves superior performance in cell-type-specific loci and gene elements, aligning with biological expectations. Furthermore, dHICA's imputations hold significant potential for downstream applications, including chromatin state segmentation and elucidating the functional implications of SNPs (Single Nucleotide Polymorphisms). In conclusion, dHICA serves as a valuable tool for advancing the understanding of chromatin dynamics, offering enhanced predictive capabilities and interpretability.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11421843/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbae459","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Histone modifications (HMs) are pivotal in various biological processes, including transcription, replication, and DNA repair, significantly impacting chromatin structure. These modifications underpin the molecular mechanisms of cell-type-specific gene expression and complex diseases. However, annotating HMs across different cell types solely using experimental approaches is impractical due to cost and time constraints. Herein, we present dHICA (deep histone imputation using chromatin accessibility), a novel deep learning framework that integrates DNA sequences and chromatin accessibility data to predict multiple HM tracks. Employing the transformer architecture alongside dilated convolutions, dHICA boasts an extensive receptive field and captures more cell-type-specific information. dHICA outperforms state-of-the-art baselines and achieves superior performance in cell-type-specific loci and gene elements, aligning with biological expectations. Furthermore, dHICA's imputations hold significant potential for downstream applications, including chromatin state segmentation and elucidating the functional implications of SNPs (Single Nucleotide Polymorphisms). In conclusion, dHICA serves as a valuable tool for advancing the understanding of chromatin dynamics, offering enhanced predictive capabilities and interpretability.
组蛋白修饰(HMs)在转录、复制和 DNA 修复等各种生物过程中起着关键作用,对染色质结构产生重大影响。这些修饰是细胞类型特异性基因表达和复杂疾病的分子机制的基础。然而,由于成本和时间限制,仅使用实验方法注释不同细胞类型的 HMs 是不切实际的。在这里,我们提出了 dHICA(利用染色质可及性进行深度组蛋白推断),这是一种新颖的深度学习框架,它整合了 DNA 序列和染色质可及性数据,可预测多种 HM 轨迹。dHICA 采用变压器架构和扩张卷积,具有广泛的感受野,能捕捉到更多细胞类型特异性信息。dHICA 的表现优于最先进的基线,在细胞类型特异性位点和基因元件方面表现出色,符合生物学预期。此外,dHICA 的推算还为染色质状态分割和阐明 SNP(单核苷酸多态性)的功能意义等下游应用提供了巨大潜力。总之,dHICA 是推进对染色质动力学理解的重要工具,它提供了更强的预测能力和可解释性。
期刊介绍:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.