A cross-domain user association scheme based on graph attention networks with trajectory embedding
Authors: Keqing Cen, Zhenghao Yang, Ze Wang, Minhong Dong
Journal: Machine Learning (Impact Factor 4.3; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-08-21
DOI: https://doi.org/10.1007/s10994-024-06613-z
Citations: 0
Abstract
With the widespread adoption of mobile internet, users generate vast amounts of location-based data across multiple social networking platforms. This data is valuable for applications such as personalized recommendations and targeted advertising, and accurately identifying users across different platforms improves the understanding of user behavior and preferences. Cross-domain user identification is complicated by varying check-in frequencies and differences in data precision. To address this, we propose HTEGAT, a hierarchical trajectory embedding-based graph attention network model that combines an Encoder module and a Trajectory Identification module. The Encoder module integrates self-attention mechanisms with LSTM to extract location point-level features and capture trajectory transition features, thereby characterizing hierarchical temporal trajectories. The Trajectory Identification module introduces trajectory distance-neighbor relationships and constructs an adjacency matrix based on these relationships. By using the attention weight coefficients of a graph attention network to capture similarities between trajectories, this approach reduces identification complexity while addressing dataset sparsity. Experiments on two cross-domain Location-Based Social Network (LBSN) datasets demonstrate that HTEGAT achieves higher hit rates with lower time complexity. On the Foursquare-Twitter dataset, HTEGAT significantly improved hit rates, surpassing state-of-the-art methods. On the Instagram-Twitter dataset, HTEGAT consistently outperformed contemporary models.
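The abstract describes building an adjacency matrix from distance-neighbor relationships between trajectories and then using graph attention coefficients to weigh similar trajectories. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes trajectories have already been encoded as fixed-length vectors, uses Euclidean distance with a k-nearest-neighbor rule (both assumptions, since the abstract does not specify the distance or the neighbor criterion), and applies the standard single-head graph attention formulation. The names `knn_adjacency`, `gat_attention`, `W`, and `a` are hypothetical.

```python
import numpy as np


def knn_adjacency(emb, k):
    """Boolean adjacency: each trajectory links to itself and its k nearest neighbors."""
    n = emb.shape[0]
    # Pairwise Euclidean distances between trajectory embeddings (assumed metric).
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # exclude self from the neighbor search
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.eye(n, dtype=bool)             # self-loops keep every row non-empty
    rows = np.repeat(np.arange(n), k)
    adj[rows, nearest.ravel()] = True
    return adj


def gat_attention(emb, adj, W, a, slope=0.2):
    """Single-head graph attention: returns coefficients and aggregated features."""
    h = emb @ W                             # projected features, shape (n, d_out)
    d = h.shape[1]
    # e_ij = LeakyReLU(a^T [h_i || h_j]), split into source and target halves of a.
    e = (h @ a[:d])[:, None] + (h @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(adj, e, -np.inf)           # attend only to linked trajectories
    e = e - e.max(axis=1, keepdims=True)    # numerically stable softmax
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)
    return alpha, alpha @ h


# Toy usage with random stand-ins for encoded trajectories (not real data).
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 8))               # 6 trajectories, 8-dim embeddings
adj = knn_adjacency(emb, k=2)
W = rng.normal(size=(8, 4))                 # hypothetical projection matrix
a = rng.normal(size=(8,))                   # hypothetical attention vector (2 * d_out)
alpha, out = gat_attention(emb, adj, W, a)
```

Restricting attention to the neighbor set means each trajectory is compared against only k candidates rather than all others, which is one plausible reading of how a neighbor-based adjacency matrix could reduce identification cost.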
About the journal:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.