Prediction of DNA Methylation based on Multi-dimensional feature encoding and double convolutional fully connected convolutional neural network.

IF 4.3 2区 生物学 PLoS Computational Biology Pub Date : 2023-08-28 eCollection Date: 2023-08-01 DOI:10.1371/journal.pcbi.1011370
Wenxing Hu, Lixin Guan, Mengshan Li
{"title":"Prediction of DNA Methylation based on Multi-dimensional feature encoding and double convolutional fully connected convolutional neural network.","authors":"Wenxing Hu, Lixin Guan, Mengshan Li","doi":"10.1371/journal.pcbi.1011370","DOIUrl":null,"url":null,"abstract":"DNA methylation takes on critical significance to the regulation of gene expression by affecting the stability of DNA and changing the structure of chromosomes. DNA methylation modification sites should be identified, which lays a solid basis for gaining more insights into their biological functions. Existing machine learning-based methods of predicting DNA methylation have not fully exploited the hidden multidimensional information in DNA gene sequences, such that the prediction accuracy of models is significantly limited. Besides, most models have been built in terms of a single methylation type. To address the above-mentioned issues, a deep learning-based method was proposed in this study for DNA methylation site prediction, termed the MEDCNN model. The MEDCNN model is capable of extracting feature information from gene sequences in three dimensions (i.e., positional information, biological information, and chemical information). Moreover, the proposed method employs a convolutional neural network model with double convolutional layers and double fully connected layers while iteratively updating the gradient descent algorithm using the cross-entropy loss function to increase the prediction accuracy of the model. Besides, the MEDCNN model can predict different types of DNA methylation sites. As indicated by the experimental results,the deep learning method based on coding from multiple dimensions outperformed single coding methods, and the MEDCNN model was highly applicable and outperformed existing models in predicting DNA methylation between different species. As revealed by the above-described findings, the MEDCNN model can be effective in predicting DNA methylation sites.","PeriodicalId":49688,"journal":{"name":"PLoS Computational Biology","volume":"19 8","pages":"e1011370"},"PeriodicalIF":4.3000,"publicationDate":"2023-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461834/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS Computational Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1371/journal.pcbi.1011370","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/8/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

DNA methylation takes on critical significance to the regulation of gene expression by affecting the stability of DNA and changing the structure of chromosomes. DNA methylation modification sites should be identified, which lays a solid basis for gaining more insights into their biological functions. Existing machine learning-based methods of predicting DNA methylation have not fully exploited the hidden multidimensional information in DNA gene sequences, such that the prediction accuracy of models is significantly limited. Besides, most models have been built in terms of a single methylation type. To address the above-mentioned issues, a deep learning-based method was proposed in this study for DNA methylation site prediction, termed the MEDCNN model. The MEDCNN model is capable of extracting feature information from gene sequences in three dimensions (i.e., positional information, biological information, and chemical information). Moreover, the proposed method employs a convolutional neural network model with double convolutional layers and double fully connected layers while iteratively updating the gradient descent algorithm using the cross-entropy loss function to increase the prediction accuracy of the model. Besides, the MEDCNN model can predict different types of DNA methylation sites. As indicated by the experimental results,the deep learning method based on coding from multiple dimensions outperformed single coding methods, and the MEDCNN model was highly applicable and outperformed existing models in predicting DNA methylation between different species. As revealed by the above-described findings, the MEDCNN model can be effective in predicting DNA methylation sites.

Abstract Image

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于多维特征编码和双卷积全连接卷积神经网络的DNA甲基化预测。
DNA甲基化通过影响DNA的稳定性和改变染色体的结构,对基因表达的调节具有关键意义。应该确定DNA甲基化修饰位点,这为深入了解其生物学功能奠定了坚实的基础。现有的基于机器学习的DNA甲基化预测方法没有充分利用DNA基因序列中隐藏的多维信息,因此模型的预测准确性受到显著限制。此外,大多数模型都是根据单一甲基化类型构建的。为了解决上述问题,本研究提出了一种基于深度学习的DNA甲基化位点预测方法,称为MEDCNN模型。MEDCNN模型能够从三维的基因序列中提取特征信息(即位置信息、生物信息和化学信息)。此外,该方法采用了具有双卷积层和双完全连接层的卷积神经网络模型,同时使用交叉熵损失函数迭代更新梯度下降算法,以提高模型的预测精度。此外,MEDCNN模型可以预测不同类型的DNA甲基化位点。实验结果表明,基于多维编码的深度学习方法优于单一编码方法,MEDCNN模型在预测不同物种之间的DNA甲基化方面具有高度的适用性,优于现有模型。正如上述发现所揭示的,MEDCNN模型可以有效地预测DNA甲基化位点。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
PLoS Computational Biology
PLoS Computational Biology 生物-生化研究方法
CiteScore
7.10
自引率
4.70%
发文量
820
期刊介绍: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales—from molecules and cells, to patient populations and ecosystems—through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown, or have the promise to provide new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.
期刊最新文献
Real-time forecasting of COVID-19-related hospital strain in France using a non-Markovian mechanistic model. Ten simple rules for teaching an introduction to R Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. A weak coupling mechanism for the early steps of the recovery stroke of myosin VI: A free energy simulation and string method analysis. Validity conditions of approximations for a target-mediated drug disposition model: A novel first-order approximation and its comparison to other approximations.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1