Link prediction in protein–protein interaction network: A similarity multiplied similarity algorithm with paths of length three
Wangmin Cai, Peiqiang Liu, Zunfang Wang, Hong Jiang, Chang Liu, Zhaojie Fei, Zhuang Yang
Journal of Theoretical Biology, volume 589, Article 111850. Published 2024-07-21 (Epub 2024-05-11).
DOI: 10.1016/j.jtbi.2024.111850
https://www.sciencedirect.com/science/article/pii/S0022519324001310
Abstract
Protein–protein interactions (PPIs) are crucial for various biological processes, and predicting PPIs is a major challenge. Link prediction is the most common approach to this problem, and link prediction methods based on network paths of length three (L3) have proven highly effective. In this paper, we propose a novel link prediction algorithm, named SMS, which is based on L3 and protein similarities. We first design a mixed similarity that combines the topological structure and attribute features of nodes. We then compute the predicted value by summing the products of the similarities along the L3 paths. Furthermore, we propose the Max Similarity Multiplied Similarity (maxSMS) algorithm from the perspective of maximum impact. Our computational prediction results show that on six datasets, including S. cerevisiae, H. sapiens, and others, the maxSMS algorithm improves the precision of the top 500 predictions, the area under the precision–recall curve, and the normalized discounted cumulative gain by an average of 26.99%, 53.67%, and 6.7%, respectively, compared to other optimal methods.
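The scoring idea in the abstract (sum, over every path of length three between two proteins, the product of the similarities along that path) can be illustrated with a minimal sketch. This is not the authors' SMS implementation: the abstract does not define the mixed similarity or state exactly which node pairs enter the product, so the sketch assumes the similarities of the three consecutive edges are multiplied and uses a hypothetical degree-based placeholder, `degree_similarity`, in place of the mixed similarity. The function name `l3_similarity_scores` is also an illustration only.

```python
from itertools import combinations
from math import sqrt


def degree_similarity(adj, a, b):
    """Hypothetical placeholder similarity: 1 / sqrt(deg(a) * deg(b)).

    Stands in for the paper's mixed similarity, which the abstract
    does not specify.
    """
    return 1.0 / sqrt(len(adj[a]) * len(adj[b]))


def l3_similarity_scores(edges, sim=degree_similarity):
    """Score non-adjacent pairs by summing similarity products over L3 paths."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already an observed interaction, not a candidate
        total = 0.0
        for u in adj[x]:
            for v in adj[y]:
                # x-u-v-y is a path of length three when u and v are linked
                # and all four nodes are distinct.
                if u != v and u != y and v != x and v in adj[u]:
                    total += sim(adj, x, u) * sim(adj, u, v) * sim(adj, v, y)
        if total > 0:
            scores[(x, y)] = total
    return scores


if __name__ == "__main__":
    # Toy chain A-B-C-D: the only L3 path links A and D.
    toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
    ranked = sorted(l3_similarity_scores(toy_edges).items(),
                    key=lambda item: item[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 4))
```

Candidate pairs would then be ranked by score, highest first, and the top of the ranking treated as predicted interactions. Passing a different `sim` function is how a similarity that also uses node attributes could be plugged in.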
About the journal:
The Journal of Theoretical Biology is the leading forum for theoretical perspectives that give insight into biological processes. It covers a very wide range of topics and is of interest to biologists in many areas of research, including:
• Brain and Neuroscience
• Cancer Growth and Treatment
• Cell Biology
• Developmental Biology
• Ecology
• Evolution
• Immunology
• Infectious and non-infectious Diseases
• Mathematical, Computational, Biophysical and Statistical Modeling
• Microbiology, Molecular Biology, and Biochemistry
• Networks and Complex Systems
• Physiology
• Pharmacodynamics
• Animal Behavior and Game Theory
Acceptable papers are those that bear significant importance on the biology per se being presented, and not on the mathematical analysis. Papers that include some data or experimental material bearing on theory will be considered, including those that contain comparative study, statistical data analysis, mathematical proof, computer simulations, experiments, field observations, or even philosophical arguments, which are all methods to support or reject theoretical ideas. However, there should be a concerted effort to make papers intelligible to biologists in the chosen field.