基于堆叠的集合学习框架，用于识别硝基酪氨酸位点。

IF 7 2区医学 Q1 BIOLOGY Computers in biology and medicine Pub Date : 2024-10-03 DOI:10.1016/j.compbiomed.2024.109200

Aiman Parvez, Syed Danish Ali, Hilal Tayara, Kil To Chong

{"title":"基于堆叠的集合学习框架，用于识别硝基酪氨酸位点。","authors":"Aiman Parvez, Syed Danish Ali, Hilal Tayara, Kil To Chong","doi":"10.1016/j.compbiomed.2024.109200","DOIUrl":null,"url":null,"abstract":"Protein nitrotyrosine is an essential post-translational modification that results from the nitration of tyrosine amino acid residues. This modification is known to be associated with the regulation and characterization of several biological functions and diseases. Therefore, accurate identification of nitrotyrosine sites plays a significant role in the elucidating progress of associated biological signs. In this regard, we reported an accurate computational tool known as iNTyro-Stack for the identification of protein nitrotyrosine sites. iNTyro-Stack is a machine-learning model based on a stacking algorithm. The base classifiers in stacking are selected based on the highest performance. The feature map employed is a linear combination of the amino composition encoding schemes, including the composition of k-spaced amino acid pairs and tri-peptide composition. The recursive feature elimination technique is used for significant feature selection. The performance of the proposed method is evaluated using k-fold cross-validation and independent testing approaches. iNTyro-Stack achieved an accuracy of 86.3% and a Matthews correlation coefficient (MCC) of 72.6% in cross-validation. Its generalization capability was further validated on an imbalanced independent test set, where it attained an accuracy of 69.32%. iNTyro-Stack outperforms existing state-of-the-art methods across both evaluation techniques. The github repository is create to reproduce the method and results of iNTyro-Stack, accessible on: https://github.com/waleed551/iNTyro-Stack/.","PeriodicalId":10578,"journal":{"name":"Computers in biology and medicine","volume":null,"pages":null},"PeriodicalIF":7.0000,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Stacking based ensemble learning framework for identification of nitrotyrosine sites.\",\"authors\":\"Aiman Parvez, Syed Danish Ali, Hilal Tayara, Kil To Chong\",\"doi\":\"10.1016/j.compbiomed.2024.109200\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Protein nitrotyrosine is an essential post-translational modification that results from the nitration of tyrosine amino acid residues. This modification is known to be associated with the regulation and characterization of several biological functions and diseases. Therefore, accurate identification of nitrotyrosine sites plays a significant role in the elucidating progress of associated biological signs. In this regard, we reported an accurate computational tool known as iNTyro-Stack for the identification of protein nitrotyrosine sites. iNTyro-Stack is a machine-learning model based on a stacking algorithm. The base classifiers in stacking are selected based on the highest performance. The feature map employed is a linear combination of the amino composition encoding schemes, including the composition of k-spaced amino acid pairs and tri-peptide composition. The recursive feature elimination technique is used for significant feature selection. The performance of the proposed method is evaluated using k-fold cross-validation and independent testing approaches. iNTyro-Stack achieved an accuracy of 86.3% and a Matthews correlation coefficient (MCC) of 72.6% in cross-validation. Its generalization capability was further validated on an imbalanced independent test set, where it attained an accuracy of 69.32%. iNTyro-Stack outperforms existing state-of-the-art methods across both evaluation techniques. The github repository is create to reproduce the method and results of iNTyro-Stack, accessible on: https://github.com/waleed551/iNTyro-Stack/.\",\"PeriodicalId\":10578,\"journal\":{\"name\":\"Computers in biology and medicine\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.0000,\"publicationDate\":\"2024-10-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers in biology and medicine\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1016/j.compbiomed.2024.109200\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in biology and medicine","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1016/j.compbiomed.2024.109200","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOLOGY","Score":null,"Total":0}

引用次数: 0

摘要

蛋白质硝基酪氨酸是一种重要的翻译后修饰，由酪氨酸氨基酸残基硝化而成。众所周知，这种修饰与多种生物功能和疾病的调节和特征有关。因此，准确鉴定硝基酪氨酸位点对阐明相关生物标志的进展起着重要作用。iNTyro-Stack 是一种基于堆叠算法的机器学习模型。堆叠算法中的基础分类器是根据最高性能选择的。采用的特征图是氨基酸组成编码方案的线性组合，包括 k 距氨基酸对组成和三肽组成。重要特征选择采用递归特征消除技术。在交叉验证中，iNTyro-Stack 的准确率达到了 86.3%，马修斯相关系数（MCC）为 72.6%。iNTyro-Stack 的泛化能力在不平衡独立测试集上得到了进一步验证，准确率达到 69.32%。为重现 iNTyro-Stack 的方法和结果，我们创建了 github 存储库，访问网址为：https://github.com/waleed551/iNTyro-Stack/。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Stacking based ensemble learning framework for identification of nitrotyrosine sites.

Protein nitrotyrosine is an essential post-translational modification that results from the nitration of tyrosine amino acid residues. This modification is known to be associated with the regulation and characterization of several biological functions and diseases. Therefore, accurate identification of nitrotyrosine sites plays a significant role in the elucidating progress of associated biological signs. In this regard, we reported an accurate computational tool known as iNTyro-Stack for the identification of protein nitrotyrosine sites. iNTyro-Stack is a machine-learning model based on a stacking algorithm. The base classifiers in stacking are selected based on the highest performance. The feature map employed is a linear combination of the amino composition encoding schemes, including the composition of k-spaced amino acid pairs and tri-peptide composition. The recursive feature elimination technique is used for significant feature selection. The performance of the proposed method is evaluated using k-fold cross-validation and independent testing approaches. iNTyro-Stack achieved an accuracy of 86.3% and a Matthews correlation coefficient (MCC) of 72.6% in cross-validation. Its generalization capability was further validated on an imbalanced independent test set, where it attained an accuracy of 69.32%. iNTyro-Stack outperforms existing state-of-the-art methods across both evaluation techniques. The github repository is create to reproduce the method and results of iNTyro-Stack, accessible on: https://github.com/waleed551/iNTyro-Stack/.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computers in biology and medicine 工程技术-工程：生物医学

CiteScore

11.70

自引率

10.40%

发文量

1086

审稿时长

74 days

期刊介绍： Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.