HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management Pub Date : 2017-11-06 DOI:10.1145/3132847.3132953

Tao-yang Fu, Wang-Chien Lee, Zhen Lei

{"title":"HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning","authors":"Tao-yang Fu, Wang-Chien Lee, Zhen Lei","doi":"10.1145/3132847.3132953","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel representation learning framework, namely HIN2Vec, for heterogeneous information networks (HINs). The core of the proposed framework is a neural network model, also called HIN2Vec, designed to capture the rich semantics embedded in HINs by exploiting different types of relationships among nodes. Given a set of relationships specified in forms of meta-paths in an HIN, HIN2Vec carries out multiple prediction training tasks jointly based on a target set of relationships to learn latent vectors of nodes and meta-paths in the HIN. In addition to model design, several issues unique to HIN2Vec, including regularization of meta-path vectors, node type selection in negative sampling, and cycles in random walks, are examined. To validate our ideas, we learn latent vectors of nodes using four large-scale real HIN datasets, including Blogcatalog, Yelp, DBLP and U.S. Patents, and use them as features for multi-label node classification and link prediction applications on those networks. Empirical results show that HIN2Vec soundly outperforms the state-of-the-art representation learning models for network data, including DeepWalk, LINE, node2vec, PTE, HINE and ESim, by 6.6% to 23.8% of $micro$-$f_1$ in multi-label node classification and 5% to 70.8% of $MAP$ in link prediction.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"59 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"515","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3132953","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 515

Abstract

In this paper, we propose a novel representation learning framework, namely HIN2Vec, for heterogeneous information networks (HINs). The core of the proposed framework is a neural network model, also called HIN2Vec, designed to capture the rich semantics embedded in HINs by exploiting different types of relationships among nodes. Given a set of relationships specified in forms of meta-paths in an HIN, HIN2Vec carries out multiple prediction training tasks jointly based on a target set of relationships to learn latent vectors of nodes and meta-paths in the HIN. In addition to model design, several issues unique to HIN2Vec, including regularization of meta-path vectors, node type selection in negative sampling, and cycles in random walks, are examined. To validate our ideas, we learn latent vectors of nodes using four large-scale real HIN datasets, including Blogcatalog, Yelp, DBLP and U.S. Patents, and use them as features for multi-label node classification and link prediction applications on those networks. Empirical results show that HIN2Vec soundly outperforms the state-of-the-art representation learning models for network data, including DeepWalk, LINE, node2vec, PTE, HINE and ESim, by 6.6% to 23.8% of $micro$-$f_1$ in multi-label node classification and 5% to 70.8% of $MAP$ in link prediction.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

HIN2Vec:探索异构信息网络中的元路径用于表示学习

本文提出了一种新的异构信息网络表示学习框架，即HIN2Vec。提出的框架的核心是一个神经网络模型，也称为HIN2Vec，旨在通过利用节点之间不同类型的关系来捕获嵌入在HINs中的丰富语义。给定HIN中以元路径形式指定的一组关系，HIN2Vec基于目标关系集联合执行多个预测训练任务，学习HIN中节点和元路径的潜在向量。除了模型设计之外，还研究了HIN2Vec独有的几个问题，包括元路径向量的正则化、负采样中的节点类型选择和随机漫步中的循环。为了验证我们的想法，我们使用四个大规模真实HIN数据集(包括Blogcatalog、Yelp、DBLP和U.S. Patents)学习节点的潜在向量，并将其用作这些网络上多标签节点分类和链接预测应用的特征。实证结果表明，HIN2Vec在网络数据表征学习模型(包括DeepWalk、LINE、node2vec、PTE、HINE和ESim)中表现出色，在多标签节点分类方面比$micro$-$f_1$高出6.6%至23.8%，在链路预测方面比$MAP$高出5%至70.8%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management

自引率

0.00%

发文量

期刊最新文献

Query and Animate Multi-attribute Trajectory Data HyPerInsight: Data Exploration Deep Inside HyPer Algorithmic Bias: Do Good Systems Make Relevant Documents More Retrievable? NeuPL: Attention-based Semantic Matching and Pair-Linking for Entity Disambiguation Health Forum Thread Recommendation Using an Interest Aware Topic Model