KbhbXG: A Machine learning architecture based on XGBoost for prediction of lysine β-Hydroxybutyrylation (Kbhb) modification sites

IF 4.2 | Zone 3 (Biology) | JCR Q1, Biochemical Research Methods | Methods | Pub Date: 2024-04-27 | DOI: 10.1016/j.ymeth.2024.04.016
Leqi Chen, Liwen Liu, Haiyan Su, Yan Xu
Methods, Volume 227, Pages 27-34. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S1046202324001063
Citation count: 0

Abstract

Lysine β-hydroxybutyrylation is an important post-translational modification (PTM) involved in various physiological and biological processes. In this research, we introduce a novel predictor, KbhbXG, which uses XGBoost to identify β-hydroxybutyrylation modification sites from protein sequence information. Traditional experimental methods for identifying β-hydroxybutyrylated sites with proteomic techniques are both costly and time-consuming. Computational methods and predictors can therefore play a crucial role in the rapid identification of β-hydroxybutyrylation sites. Our proposed KbhbXG model first utilizes the machine learning algorithm XGBoost to predict β-hydroxybutyrylation modification sites. On the independent test set, KbhbXG achieves an accuracy of 0.7457, a specificity of 0.7771, and an area under the curve (AUC) of 0.8172. This AUC demonstrates the method's potential for identifying novel β-hydroxybutyrylation sites, thereby facilitating further research into the β-hydroxybutyrylation process. In addition, functional analyses reveal that different organisms preferentially engage in distinct biological processes and pathways, which can provide insight into the mechanism of β-hydroxybutyrylation and guide experimental verification. To promote transparency and reproducibility, we have made both the code and the dataset of KbhbXG publicly available. Researchers interested in using our proposed model can access these resources at https://github.com/Lab-Xu/KbhbXG.
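For readers unfamiliar with the three evaluation metrics quoted in the abstract, the following is a minimal sketch of how accuracy, specificity, and AUC are conventionally computed for a binary site predictor. It is not taken from the KbhbXG repository: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library routine.

```python
from typing import Sequence, Tuple


def accuracy_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, specificity) for binary labels (1 = modified site)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    specificity = tn / (tn + fp)  # fraction of true negatives correctly rejected
    return accuracy, specificity


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc, spec = accuracy_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f} specificity={spec:.4f} AUC={auc:.4f}")
```

Accuracy and specificity depend on the chosen threshold, whereas AUC is threshold-free, which is one reason papers of this kind typically report it alongside the thresholded metrics.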

Source journal: Methods (Biology: Biochemical Research Methods)
CiteScore: 9.80
Self-citation rate: 2.10%
Articles published: 222
Review time: 11.3 weeks
Journal description: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches, with emphasis on clear, detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.