Charge or Pick Up? Optimizing E-Taxi Management: A Dual-Stage Heuristic Coordinated Reinforcement Learning Approach

IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8533-8553 | Impact Factor 6.4 | JCR Q1, Automation & Control Systems | CAS Tier 2, Computer Science | Published: 2024-11-05 | DOI: 10.1109/TASE.2024.3486342
Donghe Li, Chunlin Hu, Qingyu Yang, Pengtao Song, Feiye Zhang, Dou An

Abstract

In recent years, the rapid adoption of electric vehicles (EVs) in the taxi industry has transformed traditional taxi-hailing systems into electric taxi (E-taxi) hailing systems. It is therefore crucial to develop effective E-taxi management strategies that consider both passenger-taxi matching and charging planning. In this paper, we first formalize the E-taxi management optimization problem as a Markov decision process with a dynamic state space and heterogeneous actions. We then propose a dual-stage heuristic coordinated reinforcement learning (RL) approach that incorporates advanced feature selection and heuristic allocation strategies. Our approach consists of two main stages. In the first stage, we introduce a feature-guided state dimensionality stabilization proximal policy optimization (PPO) method that addresses dynamic state dimensions through feature selection, enabling each E-taxi to decide whether to charge or pick up passengers. In the second stage, we propose a heuristic coordinated assignment method that allocates charging stations and passengers to the E-taxis and feeds the resulting rewards back to the first-stage RL network. This design effectively tackles the challenge of applying RL to heterogeneous action spaces (charge vs. pick up). We evaluate the proposed method in a real-world E-taxi environment and find that it significantly improves the experience of both E-taxis and passengers. Specifically, thanks to the method's rational planning of passenger pick-up and charging, E-taxis increase their revenue by 20% compared with traditional RL methods or random scheduling. For passengers, because taxis plan their charging behavior more efficiently, the probability of an order being answered increases by 15%, while waiting time is reduced by 55%.
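The first stage's key idea, keeping the policy network's input at a fixed dimension even though the number of nearby passengers and charging stations varies over time, can be illustrated with a minimal sketch. This is only an assumed top-k-by-distance feature selection with zero padding; the `taxi`, `passengers`, and `stations` data shapes are hypothetical, not the paper's actual state encoding:

```python
import math

def build_state(taxi, passengers, stations, k=3):
    """Build a fixed-dimension state vector for one E-taxi by keeping only
    the k nearest passengers and k nearest charging stations (a stand-in
    for the paper's feature-guided state dimensionality stabilization)."""
    def nearest_coords(items, k):
        ranked = sorted(items, key=lambda it: math.dist(taxi["pos"], it["pos"]))
        feats = [coord for it in ranked[:k] for coord in it["pos"]]
        feats += [0.0] * (2 * k - len(feats))  # zero-pad when fewer than k exist
        return feats
    # battery level + k passenger positions + k station positions
    return [taxi["battery"]] + nearest_coords(passengers, k) + nearest_coords(stations, k)

state = build_state(
    {"pos": (0.0, 0.0), "battery": 0.8},
    [{"pos": (1.0, 0.0)}, {"pos": (5.0, 5.0)}],
    [{"pos": (0.0, 2.0)}],
)
# state always has length 1 + 2*k + 2*k = 13, however many candidates exist
```

A fixed-length vector like this can then be fed to an off-the-shelf PPO policy, which only has to output the binary charge-or-pick-up decision.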
These results advance E-taxi management strategies and promote the widespread adoption of electric vehicles, ultimately supporting the transition to a more sustainable transportation system.

Note to Practitioners

The increasing adoption of electric vehicles in the taxi industry has created a need for effective E-taxi management strategies that consider both passenger-taxi matching and charging planning. In this study, we introduce a dual-stage heuristic coordinated reinforcement learning approach that addresses these challenges by integrating a feature-guided state dimensionality stabilization proximal policy optimization method with a heuristic coordinated assignment method. Our approach offers practical benefits for E-taxi service providers, drivers, and passengers. For service providers, it improves dispatch efficiency, making more effective use of available resources and potentially increasing overall revenue. For drivers, it leads to better charging and pick-up decisions, increasing earnings by 20% compared with traditional methods and reducing the average occurrence of low-battery status from more than four times every 10 hours to fewer than one. Passengers, in turn, experience improved service quality: the probability of an order being answered increases by 15%, and waiting time is reduced by 100%. These improvements enhance the user experience and may encourage further adoption of E-taxis as a sustainable transportation solution. The proposed method can be integrated into existing E-taxi hailing platforms, such as DiDi and Uber, to enhance their dispatch and charging management capabilities.
As the global trend towards sustainable transportation continues to grow, our approach provides valuable insights and a practical solution for the efficient management of E-taxi fleets in modern urban environments.
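The second stage's heuristic coordinated assignment is not specified in detail in the abstract, but its role can be approximated by a simple greedy matcher. The sketch below is an assumption for illustration only (greedy nearest-target matching; the `action` field is the stage-one decision), not the paper's actual heuristic:

```python
import math

def coordinated_assign(taxis, passengers, stations):
    """Greedy stand-in for a heuristic coordinated assignment: taxis that
    chose 'charge' get the nearest unclaimed station, the rest get the
    nearest waiting passenger; each target is used at most once."""
    free_stations, waiting = list(stations), list(passengers)
    plan = {}
    for t in taxis:
        pool = free_stations if t["action"] == "charge" else waiting
        if not pool:
            continue  # no station or passenger left for this taxi
        target = min(pool, key=lambda x: math.dist(t["pos"], x["pos"]))
        pool.remove(target)
        plan[t["id"]] = target["id"]
    return plan

plan = coordinated_assign(
    [{"id": "t1", "pos": (0, 0), "action": "charge"},
     {"id": "t2", "pos": (5, 5), "action": "pickup"}],
    [{"id": "p1", "pos": (6, 6)}],
    [{"id": "s1", "pos": (1, 1)}],
)
# → {'t1': 's1', 't2': 'p1'}
```

In the paper's architecture, the outcome of this assignment (e.g., trip revenue or charging cost) would be converted into the reward signal that trains the stage-one PPO policy.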