
Transportation Research Part E-Logistics and Transportation Review: Latest Articles

Optimizing driver’s discount order acceptance strategies: A policy-improved deep deterministic policy gradient framework
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-06, DOI: 10.1016/j.tre.2025.104628
Hanwen Dai, Chang Gao, Fang He, Congyuan Ji, Yanni Yang
The rapid expansion of platform integration has emerged as an effective solution to mitigate market fragmentation by consolidating multiple ride-hailing platforms into a single application. To address heterogeneous passenger preferences, third-party integrators provide a Discount Express service, delivered by express drivers at lower trip fares. For an individual platform, encouraging broader driver participation in Discount Express services can expand the accessible demand pool and improve matching efficiency, but often at the cost of reduced profit margins. This study aims to dynamically manage drivers’ acceptance of Discount Express orders from the perspective of an individual platform, incorporating spatiotemporal demand-supply patterns. The lack of historical data under the new business model necessitates online learning. However, early-stage exploration through trial and error can be costly in practice, highlighting the need for reliable early-stage performance in real-world deployment. To address these challenges, this study formulates the decision on the proportion of drivers accepting discount orders as a continuous control task. In response to the high stochasticity, the opaque matching mechanisms employed by third-party integrators, and the limited availability of historical data, we propose an innovative policy-improved deep deterministic policy gradient (pi-DDPG) framework. The proposed framework incorporates a refiner module to boost policy performance during the early training phase, leverages a convolutional long short-term memory network to capture complex spatiotemporal patterns, and adopts a prioritized experience replay mechanism to enhance learning efficiency. A customized simulator based on a real-world dataset is developed to validate the effectiveness of the proposed pi-DDPG. Numerical experiments demonstrate that pi-DDPG achieves superior learning efficiency and significantly reduces early-stage training losses, enhancing its applicability to practical ride-hailing scenarios.
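The abstract names prioritized experience replay as one ingredient of pi-DDPG but gives no details. As a rough illustration only, the commonly used proportional variant can be sketched as follows; the function names and the values of alpha, beta, and eps are assumptions chosen for the example and are not taken from the paper.

```python
import random

def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with priority p_i = |TD error_i| + eps (alpha and eps are illustrative)."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta),
    normalized by the largest weight so that all weights are <= 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    top = max(raw)
    return [w / top for w in raw]

def sample_batch(transitions, td_errors, batch_size, rng=random):
    """Draw transitions with probability proportional to their priority and
    return each one with the weight used to correct the sampling bias."""
    probs = sampling_probabilities(td_errors)
    weights = importance_weights(probs)
    idx = rng.choices(range(len(transitions)), weights=probs, k=batch_size)
    return [(transitions[i], weights[i]) for i in idx]
```

Transitions with larger temporal-difference errors are replayed more often, while the importance weights scale down their gradient contribution to offset the non-uniform sampling.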
Citations: 0
Discriminatory order assignment and payment-setting of on-demand food-delivery platforms: A multi-action and multi-agent reinforcement learning framework
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-06, DOI: 10.1016/j.tre.2025.104653
Zijian Zhao, Sen Li
This paper studies discriminatory order assignment and payment-setting strategies for on-demand food-delivery platforms. We consider an on-demand food-delivery platform that coordinates customers, couriers, and restaurants to maximize profit. It determines how to bundle orders, assign orders to couriers, and set payments to couriers in real time. These decisions are made in a personalized manner, depending on the historical data collected from each courier, such as order acceptance and rejection rates under distinct scenarios of order assignment and payment values. A Markov Decision Process is formulated for the courier, capturing the decisions of the platform (including differentiated order assignment/bundling strategies and discriminatory payment-setting decisions) while considering its dependence on the personalized work-related data of each individual courier. To derive the optimal policies, we propose a novel multi-action and multi-agent deep reinforcement learning framework, where a double Deep Q-Network is employed to develop discrete order assignment strategies, and double Proximal Policy Optimization is utilized to determine continuous payment decisions. Within this learning framework, we introduce a novel neural network architecture that leverages the Query-Key attention mechanism to transform multiplicative time complexity into additive computational complexity for order assignment, and we adopt a variable-length Bi-LSTM module that compresses variable-length order sequences into a fixed-dimensional feature space to enhance scalability. The proposed neural network and algorithmic framework were validated in a case study using real-world food-delivery data from Hong Kong. By comparing the proposed method with a vanilla MLP-based neural network architecture, we find that the proposed neural network architecture significantly enhances platform performance: it increases the number of orders served by 5.25%, reduces platform expenses by 10%, and improves the overall reward of the platform by over 50%. Additionally, our results reveal that couriers with higher order rejection rates receive more orders during peak hours but earn lower wages. This counterintuitive finding is attributed to a strategic approach by the platform to differentiate order allocation: instead of simply allocating fewer orders to couriers with higher rejection rates, the platform preferentially assigns longer-distance trips to couriers with a higher likelihood of order acceptance. These findings expose the implicit biases in the discriminatory algorithms used by the profit-maximizing platform and highlight potential areas for governmental regulatory intervention. The code of this paper is provided at https://github.com/RS2002/Discriminatory-Food-Delivery.
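One plausible reading of the "multiplicative to additive" remark is that couriers and orders are each encoded once (N + M encoder passes) and then compared with cheap dot products, instead of running a network on every one of the N x M courier-order pairs. The sketch below illustrates that reading only; the linear `encode` stand-in, the weight matrices, and all names are hypothetical and are not taken from the authors' repository.

```python
import math

def encode(features, weights):
    """Linear projection of one feature vector (stand-in for a learned encoder)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def assignment_scores(couriers, orders, w_query, w_key):
    """Scaled dot-product scores between courier queries and order keys."""
    queries = [encode(c, w_query) for c in couriers]  # N encoder passes
    keys = [encode(o, w_key) for o in orders]         # M encoder passes
    scale = math.sqrt(len(w_query))                   # embedding dimension
    return [[sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
            for q in queries]

def softmax(row):
    """Turn one courier's scores into a distribution over orders."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]
```

The pairwise step still touches every courier-order pair, but it is a single dot product per pair rather than a full forward pass.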
Citations: 0
Inventory-constrained online learning for revenue management with delayed feedback
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-06, DOI: 10.1016/j.tre.2025.104649
Sheng Ji
Delayed feedback is a prevalent challenge in modern logistics and transportation systems, especially on digital retail platforms. This paper investigates an online learning and pricing problem characterized by aggregated and anonymous delays. In this setting, neither demand nor revenue is immediately observable following a pricing decision; instead, these metrics become available to the retailer only after some stochastic delay. The retailer also faces an initial inventory constraint, creating a complex exploration-exploitation trade-off among learning demand, generating revenue, and managing inventory. To address this challenge, we propose a novel batch-based learning algorithm, referred to as Bandits with Dual Mirror Descent (BUD for short), which integrates mirror descent with bandit control. The algorithm employs a carefully designed batch structure to isolate the impact of delayed feedback, while combining Upper Confidence Bound (UCB) for pricing with dual updates for inventory management. Our theoretical analysis shows that the regret of BUD (defined as the revenue gap between the optimal policy and the learning algorithm) grows sublinearly with the selling horizon and matches the known lower bounds for both bandits with delays and online pricing problems. We conducted numerical experiments to demonstrate that the regret of BUD converges to 0 in various scenarios.
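To make the combination of UCB pricing and dual inventory updates concrete, here is a minimal generic sketch. It is not the BUD algorithm: the batch structure and delay handling are omitted, demand per period is assumed to lie in [0, 1], the mirror map is taken to be Euclidean (so the dual step is a projected subgradient step), and all names and parameters are hypothetical.

```python
import math

def ucb_index(price, mean, pulls, t, dual):
    """Optimistic estimate of the dual-adjusted revenue (price - dual) * demand."""
    if pulls == 0:
        return float("inf")  # force one trial of every price
    bonus = math.sqrt(2.0 * math.log(t) / pulls)
    margin = price - dual
    if margin >= 0:
        demand = min(1.0, mean + bonus)   # optimistic when the margin is positive
    else:
        demand = max(0.0, mean - bonus)   # optimistic when the margin is negative
    return margin * demand

def choose_price(prices, means, pulls, t, dual):
    """Index of the price with the largest optimistic dual-adjusted revenue."""
    return max(range(len(prices)),
               key=lambda i: ucb_index(prices[i], means[i], pulls[i], t, dual))

def dual_update(dual, demand, rho, step):
    """Raise the inventory price when sales exceed the per-period budget rho
    (initial inventory / horizon), lower it otherwise, and keep it non-negative."""
    return max(0.0, dual + step * (demand - rho))
```

A larger dual variable makes each unit of inventory more expensive in the index, which steers the choice toward higher prices and slower depletion.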
Citations: 0
Joint optimization of flood water routing and congestion-aware evacuation scheduling
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-03, DOI: 10.1016/j.tre.2025.104645
Sina Bahrami, Mehdi Nourinejad, Matthew J. Roorda, Yafeng Yin
Urban flood emergencies pose significant risks to human safety and infrastructure operability, particularly in smart cities with interdependent systems. This study proposes an integrated optimization model for coordinating water and transportation networks during flood evacuations. The model simultaneously determines optimal reservoir discharge rates and dynamic vehicular evacuation schedules to maximize the number of evacuees within the limited warning time. Water flow is modeled using the Muskingum-Cunge flood-routing method to simulate flood propagation through a river-reservoir system, while traffic flow is captured via the Cell Transmission Model, which accounts for congestion dynamics and road capacities. The problem is formulated as a nonlinear program and solved through a linear relaxation using generalized Benders decomposition. A case study of the Town of High River, Canada, illustrates the model’s practical utility. Results show that the integrated strategy extends warning times, reduces congestion, and lowers the number of individuals exposed to flood risks compared to uncoordinated approaches. By enabling real-time, infrastructure-aware evacuation planning, the proposed framework offers a scalable decision-support tool for emergency managers. This work contributes to the growing body of research on the management of city infrastructures under disruption and supports the development of resilient and coordinated evacuation strategies in smart urban environments.
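For readers unfamiliar with the flood-routing component, the classical Muskingum recursion that underlies the Muskingum-Cunge method can be sketched as below. This is a textbook illustration, not the paper's formulation: the storage constant k and weighting factor x are passed in as fixed, assumed values, whereas Muskingum-Cunge derives them from channel properties, and the function names are hypothetical.

```python
def muskingum_coefficients(k, x, dt):
    """Routing coefficients for storage constant k, weighting factor x, and
    time step dt (k and dt in the same time unit); they sum to 1."""
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2

def route(inflow, k, x, dt, initial_outflow=None):
    """Outflow hydrograph from O[t+1] = c0*I[t+1] + c1*I[t] + c2*O[t]."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for prev_in, cur_in in zip(inflow, inflow[1:]):
        outflow.append(c0 * cur_in + c1 * prev_in + c2 * outflow[-1])
    return outflow
```

Routing an inflow pulse through a reach delays and flattens its peak, which is the effect an evacuation planner would exploit when choosing reservoir discharge rates.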
Citations: 0
Offline operations strategies of bike-sharing platforms: pure profit or beyond profit?
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-02, DOI: 10.1016/j.tre.2025.104644
Dongliang Guo, Zhi-Ping Fan, Minghe Sun
Bike-sharing platforms can adopt two different offline operations strategies, i.e., independent operations (Strategy I), where each platform independently manages its offline operations, and outsourcing (Strategy O), where one platform outsources its offline operations to another competing platform. With the rise of the “beyond profit” management doctrine, many bike-sharing platforms have begun to pursue dual purposes, i.e., both profits and consumer surpluses, instead of the single purpose, i.e., “pure profit”. Given these facts, this work examines the equilibrium offline operations strategies of two bike-sharing platforms in a duopoly market based on the Hotelling framework and analyzes the platform profits and consumer surplus when the platforms pursue a single purpose or dual purposes. Several important results are obtained. When the platforms engage in intensive competition, the equilibrium operations strategy of the two platforms is Strategy O, and pursuing dual purposes can harm their respective profits. Under weak platform competition, both the investment synergy effect and the investment efficiency of offline operations can significantly affect the platform equilibrium offline operations strategies, and the platforms can obtain higher profits when pursuing dual purposes than pursuing a single purpose if they give low attention weightings to consumer surplus. Additionally, consumer surplus can always be higher when the platforms pursue dual purposes than when pursuing a single purpose, but Pareto improvement may be achieved by the platforms and consumers regardless of the platform competition intensity and the adoption of Strategy I or O.
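As background for the Hotelling framework mentioned above, the textbook duopoly on a unit line can be sketched as follows. This is only the standard baseline with linear travel cost t, a common marginal cost, and pure profit maximization; it contains none of the paper's outsourcing, investment, or consumer-surplus terms, and the function names are hypothetical.

```python
def demand_platform_1(p1, p2, t):
    """Share of consumers on [0, 1] choosing platform 1 (located at 0);
    the indifferent consumer sits at 1/2 + (p2 - p1) / (2t)."""
    share = 0.5 + (p2 - p1) / (2.0 * t)
    return min(1.0, max(0.0, share))

def best_response(rival_price, cost, t):
    """Interior maximizer of (p - cost) * (1/2 + (rival_price - p) / (2t))."""
    return (rival_price + cost + t) / 2.0

def equilibrium_prices(cost, t, iterations=100):
    """Iterate simultaneous best responses; they converge to p1 = p2 = cost + t."""
    p1 = p2 = cost
    for _ in range(iterations):
        p1, p2 = best_response(p2, cost, t), best_response(p1, cost, t)
    return p1, p2
```

In this baseline the equilibrium markup equals the travel-cost parameter t, so a smaller t corresponds to more intense price competition between the two platforms.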
Citations: 0
Spatiotemporal Features-Aware relocating for idle vehicles using spatial mean field deep Q network reinforcement learning
IF 8.8, Zone 1 (Engineering Technology), Q1 ECONOMICS, Pub Date: 2026-01-02, DOI: 10.1016/j.tre.2025.104651
Zhiju Chen, Kai Liu, Jiangbo Wang, Jintao Ke
The cruising behavior of idle ride-hailing vehicles in search of passengers is a key factor restricting the spatiotemporal balance between online ride-hailing supply and passenger demand. This paper aims to simulate the strategy of transferring idle vehicles in multiple hexagonal partitions to adjacent grid partitions by proposing a spatiotemporal features-aware relocating approach (STFAR) that integrates spatiotemporal features of ride-hailing into deep reinforcement learning. Specifically, a spatial clustering algorithm and a time-series clustering algorithm are used to identify the spatiotemporal pattern of ride-hailing demand in each hexagonal partition. In addition, the direction of the central hot spot is determined by accurately predicting the short-term future travel demand of each hexagonal partition. Finally, a spatial mean field deep Q network (SMFDQN) reinforcement learning method, which treats the hexagonal partitions as a limited, fixed number of spatial agents, is proposed to optimize the efficiency of idle vehicle transfer. STFAR improves the SMFDQN method by integrating the above spatiotemporal features into the state-space and action-space designs, and effectively improves the supply-demand balance across the entire region. Experiments based on Didi Chuxing order data from a certain time period in Chengdu show that, compared with state-of-the-art algorithms, STFAR increases the cumulative order revenue by 3.64%, the demand completion rate by 4.03%, and the dispatch rate of idle vehicles by 2.98%.
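The mean-field idea on a hexagonal grid can be illustrated with a small sketch: each partition-agent summarizes its neighbors by the average of their one-hot actions rather than by their joint action. The axial coordinate convention, the integer action indices, and the function names are assumptions made for this example and do not come from the paper.

```python
# Offsets to the six neighbors of a hexagon in axial (q, r) coordinates.
AXIAL_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def neighbors(cell, cells):
    """Adjacent hexagons of `cell` that are present in `cells`."""
    q, r = cell
    return [(q + dq, r + dr) for dq, dr in AXIAL_DIRECTIONS
            if (q + dq, r + dr) in cells]

def mean_action(cell, actions, n_actions):
    """Average one-hot action of the neighboring agents.
    `actions` maps each cell to an action index in range(n_actions);
    a uniform vector is returned when the cell has no neighbors."""
    nbrs = neighbors(cell, actions)
    if not nbrs:
        return [1.0 / n_actions] * n_actions
    counts = [0] * n_actions
    for nbr in nbrs:
        counts[actions[nbr]] += 1
    return [c / len(nbrs) for c in counts]
```

In a mean-field Q network, this vector would be fed to each agent's Q function alongside its own state, so the input size stays fixed no matter how many agents interact.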
Citations: 0
The flood fighting problem: A basic model and construction heuristics
IF 8.8 Zone 1 (Engineering & Technology) Q1 ECONOMICS Pub Date: 2026-01-02 DOI: 10.1016/j.tre.2025.104636
Karolin Eisele, Alf Kimms
Natural disasters such as floods are occurring more and more frequently due to climate change and claim many victims. If protective measures such as floodplains and dams are insufficient or damaged, emergency services must be deployed. To deploy them as effectively as possible, we present a model for emergency services planning in the event of flooding. The mathematical model is based on the idea that the area of interest is subdivided into cells and that snapshots of the situation are considered at discrete time periods. This way, we can model the spread of water over time, taking the specific profile of the terrain into account. The locations and movements of the emergency teams can also be described with user-specified granularity. Since solving such models optimally is beyond today's computational capabilities, we discuss several variants of so-called construction heuristics. Such methods run fast and produce results that help to assess a flood situation and what can be achieved over time by fighting the floods. Such insights may help not only after an event has occurred, but also in advance, to be better prepared. In a computational study, the performance of heuristics based on simple priority rules is studied.
Citations: 0