Prediction of interactions between cell surface proteins by machine learning.

IF 2.8 · Zone 4 (Biology) · Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY · Proteins-Structure Function and Bioinformatics · Pub Date: 2024-04-01 · Epub Date: 2023-12-05 · DOI: 10.1002/prot.26648
Zhaoqian Su, Brian Griffin, Scott Emmons, Yinghao Wu
Proteins-Structure Function and Bioinformatics, pages 567-580. Publication type: Journal Article.
Citations: 0

Abstract

Cells detect changes in their external environments or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions to fulfill their functions. The interactions between cell surface proteins are highly dynamic and, thus, challenging to detect using traditional experimental techniques. Here, we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile so that the probability of interaction between the query proteins could be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross-validation results show that we can achieve greater than 70% accuracy in identifying the protein-protein interactions (PPIs) within this dataset. We then applied this method to a group of 46 cell surface proteins in Caenorhabditis elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins in addition to current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community. Moreover, the general framework of the machine learning classification can also be extended to study the interactions of proteins in other domain superfamilies.
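
The abstract does not give implementation details, so the following Python sketch is only a toy illustration of two ideas it mentions: building a profile for a pair of query sequences from a library of fragment pairs, and enumerating every pair in a set of proteins for screening. The function names (`best_window_identity`, `pair_profile`, `all_pairs`), the identity-based scoring, and the toy sequences are all assumptions made for illustration; they are not the authors' method or their tool's interface.

```python
from itertools import combinations
from math import comb


def best_window_identity(fragment: str, sequence: str) -> float:
    """Best fraction of identical residues over all ungapped placements
    of `fragment` along `sequence` (hypothetical scoring, for illustration)."""
    k = len(fragment)
    if k == 0 or len(sequence) < k:
        return 0.0
    best = 0
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        matches = sum(a == b for a, b in zip(fragment, window))
        best = max(best, matches)
    return best / k


def pair_profile(seq_a: str, seq_b: str, library: list[tuple[str, str]]) -> list[float]:
    """One feature per library entry (frag_1, frag_2).

    Each feature is the product of how well the two fragments match the two
    query sequences, taking the better of the two possible assignments so
    that the profile does not depend on the order of the queries.
    """
    profile = []
    for frag_1, frag_2 in library:
        forward = best_window_identity(frag_1, seq_a) * best_window_identity(frag_2, seq_b)
        reverse = best_window_identity(frag_1, seq_b) * best_window_identity(frag_2, seq_a)
        profile.append(max(forward, reverse))
    return profile


def all_pairs(names: list[str], include_self: bool = False) -> list[tuple[str, str]]:
    """Every unordered pair of proteins; self-pairs (homophilic
    interactions) are included only if requested."""
    pairs = list(combinations(names, 2))
    if include_self:
        pairs += [(name, name) for name in names]
    return pairs


# Toy data: made-up fragments and sequences, not real Ig-domain interfaces.
toy_library = [("ACD", "WYV"), ("GGG", "KKK")]
toy_profile = pair_profile("MACDE", "QWYVT", toy_library)

# Screening 46 proteins against each other, assuming unordered pairs.
proteins = [f"protein_{i}" for i in range(46)]
screen = all_pairs(proteins)
print(len(screen))  # 1035 distinct pairs, i.e. comb(46, 2)
```

In a real pipeline the profile would be far higher-dimensional, and a trained classifier would map it to an interaction probability. Whether the screen of 46 proteins counted self-pairs is not stated in the abstract; with `include_self=True` the sketch yields 1081 pairs instead of 1035.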

Source journal
Proteins-Structure Function and Bioinformatics (Biology: Biochemistry & Molecular Biology)
CiteScore: 5.90
Self-citation rate: 3.40%
Articles published: 172
Review time: 3 months
Journal description: PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS. In addition to full-length reports, short communications (usually not more than 4 printed pages) and prediction reports are welcome. Reviews are typically by invitation; authors are encouraged to submit proposed topics for consideration.