A cross-domain user association scheme based on graph attention networks with trajectory embedding

IF 4.3 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence) | Machine Learning | Pub Date: 2024-08-21 | DOI: 10.1007/s10994-024-06613-z

Keqing Cen, Zhenghao Yang, Ze Wang, Minhong Dong


Citations: 0

Abstract

With the widespread adoption of mobile internet, users generate vast amounts of location-based data across multiple social networking platforms. These data are valuable for applications such as personalized recommendation and targeted advertising, and accurately identifying the same user across platforms improves the understanding of user behavior and preferences. Cross-domain user identification is complicated by varying check-in frequencies and differences in data precision. To address this, we propose HTEGAT, a hierarchical trajectory embedding-based graph attention network model that combines an Encoder module and a Trajectory Identification module. The Encoder module integrates self-attention with LSTM to extract location point-level features and capture trajectory transition features, thereby characterizing hierarchical temporal trajectories. The Trajectory Identification module introduces trajectory distance-neighbor relationships and constructs an adjacency matrix from them. By using the attention weight coefficients of a graph attention network to capture similarities between trajectories, this approach reduces identification complexity while mitigating dataset sparsity. Experiments on two cross-domain Location-Based Social Network (LBSN) datasets show that HTEGAT achieves higher hit rates with lower time complexity. On the Foursquare-Twitter dataset, HTEGAT significantly improves hit rates over state-of-the-art methods; on the Instagram-Twitter dataset, it consistently outperforms contemporary models.
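
The Trajectory Identification step described in the abstract (a neighbor-based adjacency matrix, then graph-attention weights as trajectory similarities) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `knn_adjacency` and `gat_attention`, the k-nearest-neighbor rule, the use of Euclidean distance between random stand-in embeddings, and all sizes are assumptions made for illustration. The attention follows the standard single-head graph attention formulation.

```python
import numpy as np

def knn_adjacency(dist, k):
    """Boolean adjacency linking each trajectory to its k nearest
    neighbours and to itself (hypothetical neighbour rule)."""
    n = dist.shape[0]
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)              # exclude self from the k-NN search
    nearest = np.argsort(d, axis=1)[:, :k]   # (n, k) neighbour indices
    adj = np.eye(n, dtype=bool)              # self-loops
    rows = np.repeat(np.arange(n), k)
    adj[rows, nearest.ravel()] = True
    return adj | adj.T                       # symmetrise

def gat_attention(h, adj, W, a, slope=0.2):
    """Single-head graph attention coefficients:
    e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax over neighbours j."""
    z = h @ W                                # (n, f_out) projected features
    f_out = z.shape[1]
    e = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
    e = np.where(e > 0, e, slope * e)        # LeakyReLU
    e = np.where(adj, e, -np.inf)            # mask non-neighbours
    e = e - e.max(axis=1, keepdims=True)     # numerical stability
    w = np.exp(e)
    return w / w.sum(axis=1, keepdims=True)

# Toy data: random vectors stand in for the Encoder's trajectory embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 8))
dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
adj = knn_adjacency(dist, k=2)
W = rng.normal(size=(8, 4))                  # assumed projection matrix
a = rng.normal(size=8)                       # assumed attention vector (2 * f_out)
alpha = gat_attention(emb, adj, W, a)
print(alpha.round(3))
```

Each row of `alpha` is a probability distribution over that trajectory's neighbors. Restricting attention to the sparse adjacency is what keeps the comparison cost below an all-pairs match in this sketch.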




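Source journal

Machine Learning (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 11.00
Self-citation rate: 2.70%
Articles published: 162
Review time: 3 months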

About the journal: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.