P-Simrank: Extending Simrank to Scale-Free Bipartite Networks
Prasenjit Dey, Kunal Goel, Rahul Agrawal
Proceedings of The Web Conference 2020, published 2020-04-20
DOI: 10.1145/3366423.3380081 (https://doi.org/10.1145/3366423.3380081)
Measuring the similarity between nodes in a graph is a useful tool in many areas of computer science. SimRank, proposed by Jeh and Widom [7], is a classic node-similarity measure that is both theoretically grounded and intuitive, and it has been extensively studied and used in applications such as query rewriting, link prediction, and collaborative filtering. Existing work based on SimRank primarily focuses on preserving the microscopic structure, such as the second- and third-order proximity of the vertices, while the macroscopic scale-free property is largely ignored. The scale-free property, in which vertex degrees follow a heavy-tailed distribution, is a critical property of real-world web graphs. In this paper, we introduce P-Simrank, which extends the idea of SimRank to scale-free bipartite networks. To study the efficacy of the proposed solution on a real-world problem, we tested it on the well-known query-rewriting problem in the sponsored search domain using a bipartite click graph, similar to Simrank++ [1], which acts as our baseline. We show that Simrank++ produces sub-optimal similarity scores for bipartite graphs in which the degree distribution of the vertices follows a power law. We also show how P-Simrank can be optimized for large real-world graphs. Finally, we experimentally evaluate the P-Simrank algorithm against Simrank++ using actual click graphs obtained from Bing, and show that P-Simrank outperforms Simrank++ on a variety of metrics.
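The abstract does not state the P-Simrank update rule, so the following is only a minimal sketch of the baseline bipartite SimRank recursion of Jeh and Widom [7], on which both Simrank++ and P-Simrank build. It is not the authors' method. The function name `bipartite_simrank`, the decay constant `c=0.8`, the iteration count, and the toy query-ad click graph are all assumptions made for illustration.

```python
from itertools import product


def bipartite_simrank(edges, c=0.8, iterations=5):
    """Baseline bipartite SimRank (Jeh and Widom), NOT P-Simrank.

    edges: iterable of (query, ad) pairs from a click graph (hypothetical data).
    Returns two dicts mapping node pairs to similarity scores,
    one for the query side and one for the ad side.
    """
    query_nbrs, ad_nbrs = {}, {}
    for q, a in edges:
        query_nbrs.setdefault(q, set()).add(a)
        ad_nbrs.setdefault(a, set()).add(q)

    def identity(nodes):
        # A node is maximally similar to itself; all other pairs start at 0.
        return {(x, y): 1.0 if x == y else 0.0
                for x, y in product(nodes, repeat=2)}

    def update(nbrs, other_sim):
        # Similarity of (x, y) is the decayed average similarity of
        # their neighbours on the opposite side of the bipartite graph.
        new = {}
        for x, y in product(nbrs, repeat=2):
            if x == y:
                new[(x, y)] = 1.0
                continue
            total = sum(other_sim[(i, j)]
                        for i, j in product(nbrs[x], nbrs[y]))
            new[(x, y)] = c * total / (len(nbrs[x]) * len(nbrs[y]))
        return new

    query_sim, ad_sim = identity(query_nbrs), identity(ad_nbrs)
    for _ in range(iterations):
        # Both sides are updated from the previous iteration's scores.
        query_sim, ad_sim = (update(query_nbrs, ad_sim),
                             update(ad_nbrs, query_sim))
    return query_sim, ad_sim


# Toy click graph (assumed): two related queries share two ads.
example_edges = [
    ("camera", "ad1"), ("camera", "ad2"),
    ("digital camera", "ad1"), ("digital camera", "ad2"),
    ("flower", "ad3"),
]
query_sim, ad_sim = bipartite_simrank(example_edges)
print(query_sim[("camera", "digital camera")])
print(query_sim[("camera", "flower")])
```

Because every neighbour contributes with equal weight regardless of its degree, this baseline recursion is insensitive to the heavy-tailed degree distribution that the abstract identifies as the gap P-Simrank is meant to address.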