J. Feng, Mingyang Zhang, Huandong Wang, Zeyu Yang, Chao Zhang, Yong Li, Depeng Jin
{"title":"DPLink:基于异构移动数据的深度神经网络用户身份链接","authors":"J. Feng, Mingyang Zhang, Huandong Wang, Zeyu Yang, Chao Zhang, Yong Li, Depeng Jin","doi":"10.1145/3308558.3313424","DOIUrl":null,"url":null,"abstract":"Online services are playing critical roles in almost all aspects of users' life. Users usually have multiple online identities (IDs) in different online services. In order to fuse the separated user data in multiple services for better business intelligence, it is critical for service providers to link online IDs belonging to the same user. On the other hand, the popularity of mobile networks and GPS-equipped smart devices have provided a generic way to link IDs, i.e., utilizing the mobility traces of IDs. However, linking IDs based on their mobility traces has been a challenging problem due to the highly heterogeneous, incomplete and noisy mobility data across services. In this paper, we propose DPLink, an end-to-end deep learning based framework, to complete the user identity linkage task for heterogeneous mobility data collected from different services with different properties. DPLink is made up by a feature extractor including a location encoder and a trajectory encoder to extract representative features from trajectory and a comparator to compare and decide whether to link two trajectories as the same user. Particularly, we propose a pre-training strategy with a simple task to train the DPLink model to overcome the training difficulties introduced by the highly heterogeneous nature of different source mobility data. Besides, we introduce a multi-modal embedding network and a co-attention mechanism in DPLink to deal with the low-quality problem of mobility data. By conducting extensive experiments on two real-life ground-truth mobility datasets with eight baselines, we demonstrate that DPLink outperforms the state-of-the-art solutions by more than 15% in terms of hit-precision. Moreover, it is expandable to add external geographical context data and works stably with heterogeneous noisy mobility traces. Our code is publicly available1.","PeriodicalId":23013,"journal":{"name":"The World Wide Web Conference","volume":"24 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"68","resultStr":"{\"title\":\"DPLink: User Identity Linkage via Deep Neural Network From Heterogeneous Mobility Data\",\"authors\":\"J. Feng, Mingyang Zhang, Huandong Wang, Zeyu Yang, Chao Zhang, Yong Li, Depeng Jin\",\"doi\":\"10.1145/3308558.3313424\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online services are playing critical roles in almost all aspects of users' life. Users usually have multiple online identities (IDs) in different online services. In order to fuse the separated user data in multiple services for better business intelligence, it is critical for service providers to link online IDs belonging to the same user. On the other hand, the popularity of mobile networks and GPS-equipped smart devices have provided a generic way to link IDs, i.e., utilizing the mobility traces of IDs. However, linking IDs based on their mobility traces has been a challenging problem due to the highly heterogeneous, incomplete and noisy mobility data across services. In this paper, we propose DPLink, an end-to-end deep learning based framework, to complete the user identity linkage task for heterogeneous mobility data collected from different services with different properties. DPLink is made up by a feature extractor including a location encoder and a trajectory encoder to extract representative features from trajectory and a comparator to compare and decide whether to link two trajectories as the same user. Particularly, we propose a pre-training strategy with a simple task to train the DPLink model to overcome the training difficulties introduced by the highly heterogeneous nature of different source mobility data. Besides, we introduce a multi-modal embedding network and a co-attention mechanism in DPLink to deal with the low-quality problem of mobility data. By conducting extensive experiments on two real-life ground-truth mobility datasets with eight baselines, we demonstrate that DPLink outperforms the state-of-the-art solutions by more than 15% in terms of hit-precision. Moreover, it is expandable to add external geographical context data and works stably with heterogeneous noisy mobility traces. Our code is publicly available1.\",\"PeriodicalId\":23013,\"journal\":{\"name\":\"The World Wide Web Conference\",\"volume\":\"24 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"68\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The World Wide Web Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3308558.3313424\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The World Wide Web Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3308558.3313424","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
DPLink: User Identity Linkage via Deep Neural Network From Heterogeneous Mobility Data
Online services are playing critical roles in almost all aspects of users' life. Users usually have multiple online identities (IDs) in different online services. In order to fuse the separated user data in multiple services for better business intelligence, it is critical for service providers to link online IDs belonging to the same user. On the other hand, the popularity of mobile networks and GPS-equipped smart devices have provided a generic way to link IDs, i.e., utilizing the mobility traces of IDs. However, linking IDs based on their mobility traces has been a challenging problem due to the highly heterogeneous, incomplete and noisy mobility data across services. In this paper, we propose DPLink, an end-to-end deep learning based framework, to complete the user identity linkage task for heterogeneous mobility data collected from different services with different properties. DPLink is made up by a feature extractor including a location encoder and a trajectory encoder to extract representative features from trajectory and a comparator to compare and decide whether to link two trajectories as the same user. Particularly, we propose a pre-training strategy with a simple task to train the DPLink model to overcome the training difficulties introduced by the highly heterogeneous nature of different source mobility data. Besides, we introduce a multi-modal embedding network and a co-attention mechanism in DPLink to deal with the low-quality problem of mobility data. By conducting extensive experiments on two real-life ground-truth mobility datasets with eight baselines, we demonstrate that DPLink outperforms the state-of-the-art solutions by more than 15% in terms of hit-precision. Moreover, it is expandable to add external geographical context data and works stably with heterogeneous noisy mobility traces. Our code is publicly available1.