Two-echelon distribution systems are increasingly adopted in modern e-commerce logistics. However, as customer service requirements become more diverse, logistics providers must accommodate delivery and pickup requests simultaneously while respecting multiple non-overlapping time windows. These additional constraints make it substantially harder to balance service quality against operational cost. To address this challenge, this study extends the classical two-echelon vehicle routing problem and introduces a new variant, the two-echelon vehicle routing problem with simultaneous pickup and delivery under multiple time windows (2E-VRPSPDMTW). A mixed-integer linear programming (MILP) model is formulated to minimize total operational cost. Because the problem is NP-hard, a Q-learning-based hyper-heuristic algorithm (QLHHA) is developed. The proposed framework first applies a spatiotemporal clustering strategy to allocate customers to satellites, thereby reducing the search space. It then builds a pool of eight low-level heuristic operators, with a Q-learning mechanism acting as the high-level controller that adaptively selects which operator to apply. The method is evaluated in a case study based on real operational data from a cross-border e-commerce logistics company in the Guangdong-Hong Kong-Macau Greater Bay Area (GBA). Results on extensive test cases and ablation experiments show that QLHHA outperforms several state-of-the-art algorithms in both solution quality and stability, achieving up to a 10% reduction in total operational cost. Sensitivity analyses further reveal that, in large-scale demand scenarios, moderately widening the time windows can substantially reduce operational cost.
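To make the high-level control idea concrete, the following is a minimal sketch of a Q-learning hyper-heuristic, not the QLHHA described above. Everything specific in it is an assumption made for illustration: the toy single-tour problem with Manhattan distances, the three operators (`swap`, `reverse_segment`, `relocate`) standing in for the eight low-level heuristics, the state encoding (the previously applied operator), the reward (cost reduction), and the parameter values `alpha`, `gamma`, and `eps`.

```python
import random

def tour_cost(tour, pts):
    """Manhattan length of the closed tour visiting pts in the given order."""
    return sum(
        abs(pts[a][0] - pts[b][0]) + abs(pts[a][1] - pts[b][1])
        for a, b in zip(tour, tour[1:] + tour[:1])
    )

# Three toy low-level heuristics (hypothetical stand-ins for the eight operators).
def swap(tour, rng):
    i, j = rng.sample(range(len(tour)), 2)
    t = tour[:]
    t[i], t[j] = t[j], t[i]
    return t

def reverse_segment(tour, rng):
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def relocate(tour, rng):
    t = tour[:]
    node = t.pop(rng.randrange(len(t)))
    t.insert(rng.randrange(len(t) + 1), node)
    return t

OPERATORS = [swap, reverse_segment, relocate]

def q_learning_hyper_heuristic(pts, iters=2000, alpha=0.1, gamma=0.9,
                               eps=0.1, seed=0):
    rng = random.Random(seed)
    n_ops = len(OPERATORS)
    # Assumed state encoding: index of the previously applied operator.
    q = [[0.0] * n_ops for _ in range(n_ops)]
    current = list(range(len(pts)))
    cur_cost = tour_cost(current, pts)
    best, best_cost = current, cur_cost
    state = 0
    for _ in range(iters):
        # Epsilon-greedy choice of the next low-level heuristic.
        if rng.random() < eps:
            action = rng.randrange(n_ops)
        else:
            action = max(range(n_ops), key=lambda a: q[state][a])
        candidate = OPERATORS[action](current, rng)
        cand_cost = tour_cost(candidate, pts)
        reward = cur_cost - cand_cost  # positive when the cost drops
        # Standard Q-learning update; the next state is the operator just used.
        target = reward + gamma * max(q[action])
        q[state][action] += alpha * (target - q[state][action])
        if cand_cost <= cur_cost:  # accept only non-worsening moves
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current, cur_cost
        state = action
    return best, best_cost, q
```

Calling `q_learning_hyper_heuristic(pts)` on a list of `(x, y)` tuples returns the best tour found, its cost, and the learned Q-table. A full 2E-VRPSPDMTW solver would additionally need the satellite allocation step, capacity and time-window feasibility checks, and an acceptance rule suited to the problem, none of which are modeled here.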
