成对采样和文本驱动:一种新的图嵌入框架

Liheng Chen, Yanru Qu, Zhenghui Wang, Lin Qiu, Weinan Zhang, Ken Chen, Shaodian Zhang, Yong Yu
{"title":"成对采样和文本驱动:一种新的图嵌入框架","authors":"Liheng Chen, Yanru Qu, Zhenghui Wang, Lin Qiu, Weinan Zhang, Ken Chen, Shaodian Zhang, Yong Yu","doi":"10.1145/3308558.3313520","DOIUrl":null,"url":null,"abstract":"In graphs with rich texts, incorporating textual information with structural information would benefit constructing expressive graph embeddings. Among various graph embedding models, random walk (RW)-based is one of the most popular and successful groups. However, it is challenged by two issues when applied on graphs with rich texts: (i) sampling efficiency: deriving from the training objective of RW-based models (e.g., DeepWalk and node2vec), we show that RW-based models are likely to generate large amounts of redundant training samples due to three main drawbacks. (ii) text utilization: these models have difficulty in dealing with zero-shot scenarios where graph embedding models have to infer graph structures directly from texts. To solve these problems, we propose a novel framework, namely Text-driven Graph Embedding with Pairs Sampling (TGE-PS). TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, being able to reduce ~ 99% training samples while preserving competitive performance. TGE-PS uses Text-driven Graph Embedding (TGE), an inductive graph embedding approach, to generate node embeddings from texts. Since each node contains rich texts, TGE is able to generate high-quality embeddings and provide reasonable predictions on existence of links to unseen nodes. We evaluate TGE-PS on several real-world datasets, and experiment results demonstrate that TGE-PS produces state-of-the-art results on both traditional and zero-shot link prediction tasks.","PeriodicalId":23013,"journal":{"name":"The World Wide Web Conference","volume":"239 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Sampled in Pairs and Driven by Text: A New Graph Embedding Framework\",\"authors\":\"Liheng Chen, Yanru Qu, Zhenghui Wang, Lin Qiu, Weinan Zhang, Ken Chen, Shaodian Zhang, Yong Yu\",\"doi\":\"10.1145/3308558.3313520\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In graphs with rich texts, incorporating textual information with structural information would benefit constructing expressive graph embeddings. Among various graph embedding models, random walk (RW)-based is one of the most popular and successful groups. However, it is challenged by two issues when applied on graphs with rich texts: (i) sampling efficiency: deriving from the training objective of RW-based models (e.g., DeepWalk and node2vec), we show that RW-based models are likely to generate large amounts of redundant training samples due to three main drawbacks. (ii) text utilization: these models have difficulty in dealing with zero-shot scenarios where graph embedding models have to infer graph structures directly from texts. To solve these problems, we propose a novel framework, namely Text-driven Graph Embedding with Pairs Sampling (TGE-PS). TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, being able to reduce ~ 99% training samples while preserving competitive performance. TGE-PS uses Text-driven Graph Embedding (TGE), an inductive graph embedding approach, to generate node embeddings from texts. Since each node contains rich texts, TGE is able to generate high-quality embeddings and provide reasonable predictions on existence of links to unseen nodes. We evaluate TGE-PS on several real-world datasets, and experiment results demonstrate that TGE-PS produces state-of-the-art results on both traditional and zero-shot link prediction tasks.\",\"PeriodicalId\":23013,\"journal\":{\"name\":\"The World Wide Web Conference\",\"volume\":\"239 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The World Wide Web Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3308558.3313520\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The World Wide Web Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3308558.3313520","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

在具有丰富文本的图中,将文本信息与结构信息结合将有利于构造富有表现力的图嵌入。在各种图嵌入模型中,基于随机漫步的图嵌入模型是最受欢迎和成功的一种。然而,当应用于具有丰富文本的图时,它受到两个问题的挑战:(i)采样效率:从基于rw的模型(例如DeepWalk和node2vec)的训练目标出发,我们发现基于rw的模型可能会产生大量冗余的训练样本,这主要有三个缺点。(ii)文本利用:在图嵌入模型必须直接从文本推断图结构的情况下,这些模型难以处理零射击场景。为了解决这些问题,我们提出了一种新的框架,即文本驱动图嵌入对采样(TGE-PS)。TGE-PS使用成对采样(PS)来改进RW的采样策略,能够在保持竞争性能的同时减少~ 99%的训练样本。TGE- ps使用文本驱动图嵌入(TGE),一种归纳图嵌入方法,从文本中生成节点嵌入。由于每个节点都包含丰富的文本,TGE能够生成高质量的嵌入,并对未见节点的链接的存在提供合理的预测。我们在几个真实数据集上评估了ge - ps,实验结果表明ge - ps在传统和零射击链路预测任务上都能产生最先进的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Sampled in Pairs and Driven by Text: A New Graph Embedding Framework
In graphs with rich texts, incorporating textual information with structural information would benefit constructing expressive graph embeddings. Among various graph embedding models, random walk (RW)-based is one of the most popular and successful groups. However, it is challenged by two issues when applied on graphs with rich texts: (i) sampling efficiency: deriving from the training objective of RW-based models (e.g., DeepWalk and node2vec), we show that RW-based models are likely to generate large amounts of redundant training samples due to three main drawbacks. (ii) text utilization: these models have difficulty in dealing with zero-shot scenarios where graph embedding models have to infer graph structures directly from texts. To solve these problems, we propose a novel framework, namely Text-driven Graph Embedding with Pairs Sampling (TGE-PS). TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, being able to reduce ~ 99% training samples while preserving competitive performance. TGE-PS uses Text-driven Graph Embedding (TGE), an inductive graph embedding approach, to generate node embeddings from texts. Since each node contains rich texts, TGE is able to generate high-quality embeddings and provide reasonable predictions on existence of links to unseen nodes. We evaluate TGE-PS on several real-world datasets, and experiment results demonstrate that TGE-PS produces state-of-the-art results on both traditional and zero-shot link prediction tasks.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Decoupled Smoothing on Graphs Think Outside the Dataset: Finding Fraudulent Reviews using Cross-Dataset Analysis Augmenting Knowledge Tracing by Considering Forgetting Behavior Enhancing Fashion Recommendation with Visual Compatibility Relationship Judging a Book by Its Cover: The Effect of Facial Perception on Centrality in Social Networks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1