Multi-Agent Reinforcement Learning for Online Food Delivery with Location Privacy Preservation

Journal: Information (Switzerland) | Impact factor: 2.4 | JCR: Q3 (Computer Science, Information Systems) | Publication date: 2023-11-03 | DOI: 10.3390/info14110597
Suleiman Abahussein, Dayong Ye, Congcong Zhu, Zishuo Cheng, Umer Siddique, Sheng Shen
Citations: 1

Abstract

Online food delivery is now widely regarded as an essential service and attracts significant attention worldwide. Many companies and individuals work in this field because it offers good income and creates numerous jobs. In this research, we study how to increase the number of orders that couriers receive and thereby increase their income. Multi-agent reinforcement learning (MARL) is employed to guide couriers to areas with high demand for food delivery. The city map is divided into small grid cells, each representing a small area of the city with its own level of demand for online food delivery orders. The MARL agent learns through training which grid cell has the highest demand and then selects it, so couriers can receive more orders and increase their long-term income. While increasing the number of received orders is important, protecting customer location is also essential. We therefore propose the Protect User Location Method (PULM) to protect customer location information. PULM injects differential privacy (DP) Laplace noise based on two parameters: the size of the city area and how frequently the customer places online food delivery orders. We use two datasets, from Shenzhen, China, and Iowa, USA, to demonstrate our experimental results. The results show an increase in the number of received orders on both the Shenzhen and Iowa datasets. We also report the similarity and data utility of courier trajectories after applying our obfuscation method (PULM).
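
To make the noise-injection idea concrete, the following is a minimal Python sketch of adding Laplace noise to a customer's coordinates. It is not the authors' PULM implementation: the abstract does not say how city area size and order frequency are mapped to a noise level, so the function names and the `epsilon` and `sensitivity` parameters here are illustrative assumptions that a caller would have to set from those two factors.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables, each with
    mean `scale`, follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate_location(x, y, epsilon, sensitivity=1.0, rng=None):
    """Return (x, y) perturbed by independent Laplace noise per coordinate.

    ASSUMPTION: `epsilon` (privacy budget) and `sensitivity` are plain
    inputs. How PULM derives the noise level from city area size and
    customer order frequency is not specified in the abstract.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon  # smaller epsilon means more noise
    return (x + laplace_noise(scale, rng), y + laplace_noise(scale, rng))


# Example: perturb a location given in arbitrary grid units.
example_rng = random.Random(0)
print(obfuscate_location(12.0, 7.0, epsilon=0.5, rng=example_rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger obfuscation, at the cost of the trajectory similarity and data utility that the abstract says the authors evaluate.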
Source journal
Information (Switzerland) (Computer Science, Information Systems)
CiteScore: 6.90
Self-citation rate: 0.00%
Articles published: 515
Review time: 11 weeks