Con&Net: A Cross-Network Anchor Link Discovery Method Based on Embedding Representation
Authors: Xueyuan Wang, Hongpo Zhang, Zongmin Wang, Yaqiong Qiao, Jiangtao Ma, Honghua Dai
Journal: ACM Transactions on Knowledge Discovery from Data (TKDD)
Publication type: Journal Article
Published: 2021-09-04
DOI: 10.1145/3469083 (https://doi.org/10.1145/3469083)
Citations: 3
Abstract
Cross-network anchor link discovery is an important research problem with many applications in heterogeneous social networks. Existing schemes for cross-network anchor link discovery can provide reasonable results, but the quality of those results depends on the features of the platform, so there is no theoretical guarantee of their stability. This article uses user embedding features to model the relationship between cross-platform accounts: the more similar the embedding features of two users are, the more similar the two accounts are. The similarity of user embedding features is determined by the distance between them in the latent space. Based on these user embedding features, this article proposes Con&Net (Content and Network), an embedding representation-based method for the cross-network anchor link discovery problem. Con&Net combines the user’s profile features, user-generated content (UGC) features, and social structure features to measure the similarity of two user accounts. Con&Net first trains on the user’s profile features to obtain a profile embedding. It then trains on the network structure of the nodes to obtain a structure embedding. It joins the two embeddings by vector concatenation and computes the cosine similarity of the resulting vectors. This cosine similarity measures the similarity of the user accounts. Finally, Con&Net predicts links based on this similarity for account pairs across the two networks. Extensive experiments on the Sina Weibo and Twitter networks show that Con&Net outperforms the state-of-the-art method. For anchor link prediction, the area under the receiver operating characteristic (ROC) curve (AUC) is 11% higher than that of the baseline method, and Precision@30 is 25% higher.
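The concatenate-then-compare step described in the abstract can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the embeddings are taken as given arrays (the training that produces them is not shown), and the function names, the toy data, the greedy best-match rule, and the 0.5 threshold are all hypothetical choices made for the example.

```python
import numpy as np

def concat_embeddings(profile_emb, structure_emb):
    """Join each user's profile and structure embeddings into one vector."""
    return np.concatenate([profile_emb, structure_emb], axis=1)

def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T

def predict_anchor_links(sim, threshold=0.5):
    """Assumed decision rule: for each account in network A, take the most
    similar account in network B and keep the pair if it clears the threshold."""
    best = sim.argmax(axis=1)
    return [(i, int(j)) for i, j in enumerate(best) if sim[i, j] >= threshold]

# Toy data (hypothetical): two accounts per network, 2-d embeddings each.
profile_a = np.array([[1.0, 0.0], [0.0, 1.0]])
structure_a = np.array([[1.0, 0.0], [0.0, 1.0]])
profile_b = np.array([[0.0, 1.0], [1.0, 0.0]])
structure_b = np.array([[0.0, 1.0], [1.0, 0.0]])

users_a = concat_embeddings(profile_a, structure_a)  # shape (2, 4)
users_b = concat_embeddings(profile_b, structure_b)  # shape (2, 4)
sim = cosine_similarity_matrix(users_a, users_b)     # shape (2, 2)
links = predict_anchor_links(sim)
print(links)
```

In this toy setup, account 0 in the first network has the same combined vector as account 1 in the second, and vice versa, so those are the two pairs the sketch links. A real pipeline would likely need a more careful matching rule (for example, enforcing one-to-one matches) than the per-row argmax assumed here.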