Predicting gene expression state and prioritizing putative enhancers using 5hmC signal.

IF 10.1 | Zone 1, Biology | Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Genome Biology | Pub Date: 2024-06-03 | DOI: 10.1186/s13059-024-03273-z
Edahi Gonzalez-Avalos, Atsushi Onodera, Daniela Samaniego-Castruita, Anjana Rao, Ferhat Ay

Abstract

Background: Like its parent base 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC) is a direct epigenetic modification of cytosines in the context of CpG dinucleotides. 5hmC is the most abundant oxidized form of 5mC, generated through the action of TET dioxygenases at gene bodies of actively transcribed genes and at active or lineage-specific enhancers. Although such enrichments have been reported for 5hmC, predictive models that use 5hmC to infer gene expression state or putative regulatory regions for genes have not yet been developed.

Results: Here, using only 5hmC enrichment in genic regions and their vicinity, we develop neural network models that predict gene expression state across 49 cell types. We show that our deep neural network models distinguish high vs low expression state using only 5hmC levels, and that these predictive models generalize to unseen cell types. Further, to leverage 5hmC signal in distal enhancers for expression prediction, we employ an Activity-by-Contact model and also develop a graph convolutional neural network model, both of which use Hi-C data and 5hmC enrichment to prioritize enhancer-promoter links. These approaches identify known and novel putative enhancers for key genes in multiple immune cell subsets.
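
As a rough illustration of the Activity-by-Contact idea mentioned above, the sketch below scores each candidate element for a single gene as its activity multiplied by its contact frequency with the gene's promoter, normalized by the sum of that product over all candidate elements. This is the generic Activity-by-Contact formulation and is not taken from the paper's code; treating a 5hmC enrichment value as the "activity" term is an assumption made here for illustration, and the function names, element names, and numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def abc_scores(activity: Sequence[float], contact: Sequence[float]) -> List[float]:
    """Return one Activity-by-Contact score per candidate element of a gene.

    activity: per-element activity (assumed here to be a 5hmC enrichment value).
    contact:  per-element contact frequency with the gene's promoter (e.g. Hi-C).
    Each score is activity * contact divided by the sum of that product over
    all candidate elements, so non-zero inputs yield scores that sum to 1.
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have the same length")
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        # No element has both activity and contact: nothing to prioritize.
        return [0.0] * len(products)
    return [p / total for p in products]


def rank_links(names: Sequence[str],
               activity: Sequence[float],
               contact: Sequence[float]) -> List[Tuple[str, float]]:
    """Pair element names with their scores, highest score first."""
    scores = abc_scores(activity, contact)
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy input: three candidate elements near one gene.
ranked = rank_links(["E1", "E2", "E3"], [4.0, 1.0, 3.0], [0.5, 1.0, 0.0])
for name, score in ranked:
    print(f"{name}\t{score:.3f}")
```

In this toy input, E3 has high activity but no contact with the promoter, so it receives a score of zero, which is the behaviour that makes the contact term useful for prioritizing enhancer-promoter links.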

Conclusions: Our work highlights the importance of 5hmC in gene regulation through proximal and distal mechanisms and provides a framework to link it to genome function. With the recent advances in 6-letter DNA sequencing by short- and long-read techniques, profiling of 5mC and 5hmC may be done routinely in the near future, providing a broad range of applications for the methods developed here.

Source journal: Genome Biology (Biochemistry, Genetics and Molecular Biology: Genetics)
CiteScore: 21.00
Self-citation rate: 3.30%
Articles published: 241
Review time: 2 months
Journal description: Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens. With an impact factor of 12.3 (2022), the journal is the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters. Notably, Genome Biology holds the distinction of being the highest-ranked open-access journal in this category. Our dedicated team of highly trained in-house Editors collaborates closely with our esteemed Editorial Board of international experts, ensuring the journal remains at the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits underscores our commitment to staying abreast of the latest developments in the field.