CAA-PPI: A Computational Feature Design to Predict Protein–Protein Interactions Using Different Encoding Strategies

AI (Basel, Switzerland). IF 3.1, Q2 (Computer Science, Artificial Intelligence). Pub Date: 2023-04-28. DOI: 10.3390/ai4020020
Bhawna Mewara, Gunjan Sahni, Soniya Lalwani, Rajesh Kumar
Citations: 0

Abstract

CAA-PPI: A Computational Feature Design to Predict Protein–Protein Interactions Using Different Encoding Strategies
Protein–protein interactions (PPIs) are involved in a wide variety of biological processes, including cell-to-cell interactions and metabolic and developmental control. Identifying PPIs has become one of the most important aims of systems biology, because PPIs play a fundamental role in predicting the function of a target protein and the druggability of molecules. A great deal of work has gone into methods that predict PPIs computationally, since this supplements laboratory experiments and offers a cost-effective way of predicting the most likely set of interactions at the scale of an entire proteome. This article presents a feature representation method (CAA-PPI) that extracts features from protein sequences using two different encoding strategies, followed by an ensemble learning method. The random forest method was used as the classifier for PPI prediction. CAA-PPI considers the role of the trigram and the bond of a given amino acid with its neighboring residues. On two diverse PPI datasets, H. pylori and Yeast, the proposed model achieved a prediction accuracy of more than 98% with one encoding scheme and more than 95% with the other. Comparisons of CAA-PPI with existing sequence-based methods showed that the proposed method performs well with both encoding strategies. To further assess its practical predictive ability, a blind test was carried out on datasets from five other species, independent of the training set, and the results confirmed the effectiveness of CAA-PPI with both encoding schemes.
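The abstract does not spell out the two encoding strategies, so the following is only a minimal sketch of the general idea of trigram-based sequence features, not the CAA-PPI encoding itself. It assumes the standard 20-letter amino acid alphabet, counts overlapping trigrams, normalizes them to frequencies, and concatenates the vectors of the two proteins in a pair. The names `trigram_frequencies` and `pair_features` and the example sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption: the standard 20 amino acids; other symbols (e.g. X) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIGRAMS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]  # 20^3 = 8000
TRIGRAM_INDEX = {t: i for i, t in enumerate(TRIGRAMS)}


def trigram_frequencies(sequence):
    """Return an 8000-dimensional vector of overlapping trigram frequencies."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    vec = [0.0] * len(TRIGRAMS)
    total = 0
    for tri, n in counts.items():
        idx = TRIGRAM_INDEX.get(tri)
        if idx is not None:  # skip trigrams containing non-standard residues
            vec[idx] = float(n)
            total += n
    if total:
        vec = [v / total for v in vec]
    return vec


def pair_features(seq_a, seq_b):
    """Concatenate the trigram vectors of two proteins into one pair feature."""
    return trigram_frequencies(seq_a) + trigram_frequencies(seq_b)


# Hypothetical toy sequences, for illustration only.
features = pair_features("MKVLAAGIVK", "MKVLSTTWY")
print(len(features))
```

A vector like this, one per candidate protein pair, could then be passed with interaction labels to a random forest classifier, which is the classifier the abstract names; the concatenation step here is an illustrative choice rather than something the abstract specifies.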