RLMob

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. Pub Date: 2022-02-11. DOI: 10.1145/3488560.3498438

Ziyan Luo, Congcong Miao


Citations: 3

Abstract

Human mobility prediction is an important task in spatiotemporal sequential data mining and urban computing. Despite extensive work on mining human mobility behavior, little attention has been paid to the problem of successive mobility prediction. State-of-the-art methods for human mobility prediction are mainly based on supervised learning. Achieving higher predictability and adapting well to successive mobility prediction poses four key challenges: 1) existing methods cannot handle an optimization target that is a discrete-continuous hybrid and non-differentiable (in our work, we assume that a user's demands are always multi-targeted and can be modeled as a discrete-continuous hybrid function); 2) it is difficult to alter the recommendation strategy flexibly as user needs change in real scenarios; 3) predicting multiple points in succession suffers from error propagation and exposure bias; 4) existing methods cannot interactively explore a user's potential interests that do not appear in the history. Because previous methods struggle with these difficulties, reinforcement learning (RL) is a natural fit for this task. We introduce RL to the successive prediction task and formulate the problem as a Markov Decision Process. We further propose a framework, RLMob, to solve it. We design a simulated environment and apply an actor-critic framework with an instance of Proximal Policy Optimization (PPO) to cope with the large state space of our setting. Experiments show that our approach consistently outperforms the compared approaches on this task.
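The abstract names PPO but gives no implementation details, so the following is only a minimal sketch of the standard PPO clipped surrogate objective, not RLMob's actual code. The function name `ppo_clip_objective`, its parameters, and the default `eps=0.2` are assumptions made for illustration; the states, actions, and rewards of the mobility MDP are not specified in the abstract and are left out.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates,
    e.g. produced by the critic. All names here are hypothetical.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# When both policies agree, every ratio is 1 and the objective
# reduces to the mean advantage.
value = ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])
```

The clipping limits how much a single update can gain from moving the policy away from the one that collected the data, which is one reason PPO is a common choice for the actor in actor-critic setups.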
