DeepCBA: A deep learning framework for gene expression prediction in maize based on DNA sequences and chromatin interactions.

Plant Communications (IF 9.4; Zone 1, Biology; JCR Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY) Pub Date: 2024-06-10 DOI: 10.1016/j.xplc.2024.100985
Zhenye Wang, Yong Peng, Jie Li, Jiying Li, Hao Yuan, Shangpo Yang, Xinru Ding, Ao Xie, Jiangling Zhang, Shouzhe Wang, Keqin Li, Jiaqi Shi, Guangjie Xing, Weihan Shi, Jianbing Yan, Jianxiao Liu

Abstract

Chromatin interactions create spatial proximity between distal regulatory elements and target genes in the genome, which has an important impact on gene expression, transcriptional regulation, and phenotypic traits. To date, several methods have been developed for predicting gene expression. However, existing methods do not take into consideration the effect of chromatin interactions on target gene expression, thus potentially reducing the accuracy of gene expression prediction and mining of important regulatory elements. In this study, we developed a highly accurate deep learning-based gene expression prediction model (DeepCBA) based on maize chromatin interaction data. Compared with existing models, DeepCBA exhibits higher accuracy in expression classification and expression value prediction. The average Pearson correlation coefficients (PCCs) for predicting gene expression using gene promoter proximal interactions, proximal-distal interactions, and both proximal and distal interactions were 0.818, 0.625, and 0.929, respectively, representing an increase of 0.357, 0.16, and 0.469 over the PCCs obtained with traditional methods that use only gene proximal sequences. Some important motifs were identified through DeepCBA; they were enriched in open chromatin regions and expression quantitative trait loci and showed clear tissue specificity. Importantly, experimental results for the maize flowering-related gene ZmRap2.7 and the tillering-related gene ZmTb1 demonstrated the feasibility of DeepCBA for exploration of regulatory elements that affect gene expression. Moreover, promoter editing and verification of two reported genes (ZmCLE7 and ZmVTE4) demonstrated the utility of DeepCBA for the precise design of gene expression and even for future intelligent breeding. DeepCBA is available at http://www.deepcba.com/ or http://124.220.197.196/.
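
The accuracy figures above are Pearson correlation coefficients between predicted and measured expression, reported together with their gains over sequence-only methods. The sketch below is only an illustration of that arithmetic, not the DeepCBA code: `pearson` is a plain textbook implementation, the toy `observed`/`predicted` vectors are hypothetical, and the "implied baseline" values are obtained here by subtracting each reported gain from the reported PCC; the abstract does not state them directly.

```python
from math import sqrt


def pearson(xs, ys):
    """Textbook Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance term and the two standard-deviation terms (unnormalised;
    # the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# PCCs and gains exactly as reported in the abstract.
REPORTED = {
    "proximal": (0.818, 0.357),
    "proximal-distal": (0.625, 0.16),
    "proximal+distal": (0.929, 0.469),
}


def implied_baselines(reported=REPORTED):
    """Derived here (not reported in the abstract): PCC minus reported gain."""
    return {name: round(pcc - gain, 3) for name, (pcc, gain) in reported.items()}


if __name__ == "__main__":
    # Hypothetical toy data, only to show how a PCC would be computed.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print("toy PCC:", pearson(observed, predicted))
    print("implied sequence-only baselines:", implied_baselines())
```

Under this subtraction, the implied sequence-only baselines are 0.461, 0.465, and 0.460, so the three reported gains are measured against roughly the same starting accuracy.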

Source journal
Plant Communications (Agricultural and Biological Sciences: Plant Science)
CiteScore: 15.70
Self-citation rate: 5.70%
Articles published: 105
Review time: 6 weeks
About the journal: Plant Communications is an open access publishing platform that supports the global plant science community. It publishes original research, review articles, technical advances, and research resources in various areas of plant sciences. The scope of topics includes evolution, ecology, physiology, biochemistry, development, reproduction, metabolism, molecular and cellular biology, genetics, genomics, environmental interactions, biotechnology, breeding of higher and lower plants, and their interactions with other organisms. The goal of Plant Communications is to provide a high-quality platform for the dissemination of plant science research.