Support vector machine prediction of N- and O-glycosylation sites using whole sequence information and subcellular localization

Q3 Biochemistry, Genetics and Molecular Biology | IPSJ Transactions on Bioinformatics | Pub Date: 2009-12-01 | DOI: 10.2197/IPSJTBIO.2.25
Kenta Sasaki, Nobuyoshi Nagamine, Y. Sakakibara
IPSJ Transactions on Bioinformatics, vol. 2, pp. 25-35 (journal article)
Citations: 16

Abstract

Background: Glycans, or sugar chains, are one of the three types of chain (DNA, protein and glycan) that constitute living organisms; they are often called “the third chain of the living organism”. Based on the SWISS-PROT database, about half of all proteins are estimated to be glycosylated. Glycosylation is one of the most important post-translational modifications, affecting the tertiary structure of proteins and many of their critical functions, including cellular communication. To predict N-glycosylation and O-glycosylation sites computationally, we developed three kinds of support vector machine (SVM) models, which use local information, general protein information and/or subcellular localization, taking into account the binding specificity of glycosyltransferases and the characteristic subcellular localization of glycoproteins.

Results: In our computational experiment, the model integrating all three kinds of information achieved about 90% accuracy in predicting both N-glycosylation and O-glycosylation sites. We also applied the model to a protein whose glycosylation sites had not previously been identified and showed that the predicted sites were structurally reasonable.

Conclusions: We developed a comprehensive and effective computational method for predicting glycosylation sites that is applicable at a genome-wide level.
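
The abstract names three kinds of input (local information, general protein information and subcellular localization) without giving their encoding. The following is a minimal sketch of how such a combined feature vector might be built for each candidate site. The window size, the one-hot encoding, the use of amino-acid composition as the "general protein information", and the localization classes are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical localization classes; the abstract does not list the ones used.
LOCALIZATIONS = ("extracellular", "membrane", "cytoplasm", "nucleus")


def candidate_sites(seq, kind):
    """Return 0-based positions of candidate residues: N for N-linked, S/T for O-linked."""
    targets = {"N": "N", "O": "ST"}[kind]
    return [i for i, aa in enumerate(seq) if aa in targets]


def local_features(seq, pos, half_window=3):
    """One-hot encode the window centred on pos (assumed encoding).

    Each window position gets 21 slots: 20 amino acids plus one
    slot for padding beyond the sequence ends or unknown residues.
    """
    vec = []
    for i in range(pos - half_window, pos + half_window + 1):
        slot = [0] * (len(AMINO_ACIDS) + 1)
        if 0 <= i < len(seq) and seq[i] in AMINO_ACIDS:
            slot[AMINO_ACIDS.index(seq[i])] = 1
        else:
            slot[-1] = 1
        vec.extend(slot)
    return vec


def global_features(seq):
    """Amino-acid composition of the whole (non-empty) sequence, as fractions."""
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def localization_features(label):
    """One-hot encode the subcellular localization label."""
    return [1 if label == loc else 0 for loc in LOCALIZATIONS]


def site_vector(seq, pos, localization, half_window=3):
    """Concatenate the three feature groups into one vector per candidate site."""
    return (
        local_features(seq, pos, half_window)
        + global_features(seq)
        + localization_features(localization)
    )
```

Vectors built this way, paired with known glycosylated and non-glycosylated labels, could then be passed to any SVM implementation (for example `sklearn.svm.SVC` from scikit-learn); dropping one or two of the three feature groups would give the reduced model variants.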