Multi-Agent Reinforcement Learning for Online Food Delivery with Location Privacy Preservation

Information (Switzerland) · IF 2.4 · Q3 (Computer Science, Information Systems) · Pub Date: 2023-11-03 · DOI: 10.3390/info14110597

Suleiman Abahussein, Dayong Ye, Congcong Zhu, Zishuo Cheng, Umer Siddique, Sheng Shen

Citations: 1

Abstract

Online food delivery is now widely regarded as an essential service and receives significant attention worldwide. Many companies and individuals take part in this field because it offers good income and creates numerous jobs. In this research, we study how to increase the number of orders that couriers receive and thereby increase their income. Multi-agent reinforcement learning (MARL) is employed to guide couriers to areas with high demand for food delivery. The city map is divided into small grid cells, each representing a small area of the city with its own level of demand for online food delivery orders. The MARL agent learns through training which grid cell has the highest demand and then selects it. Couriers can thus receive more food delivery orders and increase their long-term income. While increasing the number of received orders is important, protecting customer location is also essential. We therefore propose the Protect User Location Method (PULM) to protect customer location information. PULM injects differential privacy (DP) Laplace noise based on two parameters: the size of the city area and the customer's frequency of online food delivery orders. We use two datasets (Shenzhen, China, and Iowa, USA) to demonstrate the results of our experiments. The results show an increase in the number of received orders on both the Shenzhen and Iowa City datasets. We also report the similarity and data utility of courier trajectories after applying our obfuscation method (PULM).
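
The abstract says that PULM calibrates Laplace noise from two parameters (city area size and customer order frequency) but does not give the formula. The sketch below is therefore only an illustration of the general idea, not the authors' method. The function names, the `base_epsilon` parameter, and the calibration rule in `noise_scale` (sensitivity taken as the side length of the area, privacy budget divided by the number of orders) are all assumptions made for this example.

```python
import math
import random


def laplace_sample(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    Uses the fact that the difference of two i.i.d. exponential
    variables with mean `scale` is Laplace-distributed with that scale.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noise_scale(area_km2: float, orders_per_month: int,
                base_epsilon: float = 1.0) -> float:
    """Hypothetical calibration rule (NOT taken from the paper).

    Assumption 1: sensitivity grows with the side length of the area.
    Assumption 2: a frequent customer exposes the same location more
    often, so the budget per order shrinks and the noise grows.
    """
    sensitivity = math.sqrt(area_km2)
    epsilon = base_epsilon / max(1, orders_per_month)
    return sensitivity / epsilon


def obfuscate(x_km: float, y_km: float, area_km2: float,
              orders_per_month: int, rng: random.Random,
              base_epsilon: float = 1.0) -> tuple[float, float]:
    """Add independent Laplace noise to each coordinate of a location."""
    b = noise_scale(area_km2, orders_per_month, base_epsilon)
    return x_km + laplace_sample(b, rng), y_km + laplace_sample(b, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # A customer at (3.0, 4.0) km in a 4 km^2 area with 2 orders per month.
    print(obfuscate(3.0, 4.0, area_km2=4.0, orders_per_month=2, rng=rng))
```

Under these assumed rules, a larger area or a more frequent customer yields a larger noise scale. Adding independent per-coordinate noise is the simplest choice here. This sketch makes no claim about the formal privacy guarantee that such noise gives for two-dimensional locations.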
