LyFor: Prediction of lysine formylation sites from sequence based features using support vector machine

Md. Sohrawordi, Md. Al Mehedi Hasan

Published in: 2020 IEEE Region 10 Symposium (TENSYMP), pp. 250-253
Publication date: 2020-06-05
DOI: 10.1109/TENSYMP50017.2020.9230689 (https://doi.org/10.1109/TENSYMP50017.2020.9230689)
Citations: 1

Abstract

Lysine formylation is a recently discovered post-translational modification (PTM) found mainly on nuclear histone proteins. It plays a role in mechanisms of cellular chromatin regulation such as DNA binding, DNA repair, and protein synthesis, and it strongly affects other PTMs such as methylation and acetylation. Because computational methods are simpler and faster than traditional experimental methods, a mathematical model that accurately identifies formylated lysine sites is needed. This study develops a bioinformatics tool named LyFor, which uses amino acid composition (AAC), amino acid index (AAI), binary encoding (BE), and composition of k-spaced amino acid pairs (CKSAAP) as feature construction techniques to distinguish formylated from non-formylated lysine residues. The training dataset was preprocessed with principal component analysis (PCA) for dimensionality reduction and with random oversampling, and was then used to train a support vector machine model. In 10-fold cross-validation, LyFor achieves an accuracy of 90.02%, outperforming existing models. The analysis and prediction of lysine formylation may therefore provide useful information for studying the mechanisms of chromatin regulation.
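The sequence encodings and the oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 21-residue window, the choice of k = 0..2 for CKSAAP, the example sequence, and all function names are assumptions made for illustration. AAI is omitted because it depends on physicochemical index values that the abstract does not list, and PCA and SVM training are left to a numerical library.

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def aac(window):
    """Amino acid composition: fraction of each residue in the window."""
    n = len(window)
    return [window.count(a) / n for a in AMINO_ACIDS]

def binary_encoding(window):
    """One-hot encode each position; non-standard symbols become all zeros."""
    vec = []
    for res in window:
        vec.extend(1 if res == a else 0 for a in AMINO_ACIDS)
    return vec

def cksaap(window, k_max=2):
    """Composition of k-spaced residue pairs for k = 0..k_max.

    Assumes len(window) > k_max + 1 so every gap has at least one pair.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    vec = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = len(window) - k - 1  # number of pairs separated by k residues
        for i in range(total):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        vec.extend(counts[p] / total for p in pairs)
    return vec

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_label.values())
    X_out, y_out = [], []
    for label, rows in by_label.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out

# Hypothetical 21-residue window with the candidate lysine (K) at the center.
window = "ACDEFGHIKLKLMNPQRSTVW"
features = aac(window) + binary_encoding(window) + cksaap(window)
```

Under these assumptions, each window yields 20 AAC values, 420 binary values, and 1200 CKSAAP values. The concatenated vector is what a dimensionality reduction step such as PCA would then compress before classifier training.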