In vehicular networks, federated learning (FL) using multimodal sensory data from connected vehicles is critical for emerging autonomous driving applications, such as trajectory prediction, object detection, and semantic segmentation. However, imbalanced learning contributions across modalities and heterogeneous computation/communication capabilities of vehicles pose fundamental challenges, leading to biased model training, slow convergence, and inefficient resource utilization. In this paper, we propose V-FedMM, a novel framework for vehicular federated learning with multimodal data that optimizes both sample and vehicle selection. Specifically, sample selection follows a two-stage mechanism: (i) modality-level selection, where the server determines vehicle participation and per-modality sample counts based on onboard data distribution and modality-specific training contributions; and (ii) sample-level selection, where each vehicle selects specific multimodal data samples by balancing sample importance and utilization to strategically incentivize the participation of low-contribution data. We theoretically analyze the impact of varying modality contributions on training performance by deriving an upper bound on the loss function. To determine the optimal per-modality sample counts for each vehicle under a strict per-round delay constraint, we formulate a joint vehicle selection and per-modality sample count optimization problem that maximizes the contribution from all selected samples. We solve the optimization problem by first selecting vehicles guided by a theoretical property, and then solving the remaining sample allocation problem as an integer linear program (ILP) with the branch-and-bound algorithm. Extensive simulation results demonstrate the superior performance of V-FedMM, which achieves nearly 5% higher accuracy than conventional vehicular FL approaches and reduces computation overhead by 16.8% compared to a conventional sampling algorithm.
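To make the second stage concrete, the sketch below shows how the per-modality sample allocation can be posed as a small integer program and solved by branch and bound, once vehicles have already been selected. This is a minimal illustration, not the paper's implementation: the contribution scores, per-sample delays, sample cap, and delay budget are all hypothetical placeholders, and the optimistic bound used for pruning (best remaining contribution-to-delay ratio times the leftover budget) is one simple choice among many.

```python
# Hypothetical sketch: allocate per-modality sample counts to already-selected
# vehicles, maximizing total training contribution under a per-round delay
# budget, via depth-first branch and bound. All numbers are illustrative.

# contribution[v][m]: per-sample training contribution of modality m on vehicle v
contribution = [[3.0, 1.5], [2.0, 2.5]]
# cost[v][m]: per-sample computation/communication delay of modality m on vehicle v
cost = [[1.0, 2.0], [1.5, 1.0]]
MAX_SAMPLES = 4      # per-modality cap from onboard data availability (assumed)
DELAY_BUDGET = 10.0  # strict per-round delay constraint (assumed)


def branch_and_bound():
    items = [(v, m) for v in range(len(contribution)) for m in range(len(contribution[0]))]
    best = {"value": 0.0, "plan": None}

    def upper_bound(i, value, budget):
        # Optimistic bound: fill the remaining budget at the best
        # contribution-to-delay ratio among the undecided (vehicle, modality) pairs.
        ratios = [contribution[v][m] / cost[v][m] for v, m in items[i:]]
        return value + (max(ratios) * budget if ratios else 0.0)

    def recurse(i, counts, value, budget):
        if upper_bound(i, value, budget) <= best["value"]:
            return  # prune: this branch cannot beat the incumbent
        if i == len(items):
            best["value"] = value
            best["plan"] = [row[:] for row in counts]
            return
        v, m = items[i]
        # Branch on the sample count for this (vehicle, modality) pair.
        for n in range(MAX_SAMPLES, -1, -1):
            if n * cost[v][m] <= budget:
                counts[v][m] = n
                recurse(i + 1, counts, value + n * contribution[v][m],
                        budget - n * cost[v][m])
                counts[v][m] = 0

    recurse(0, [[0] * len(contribution[0]) for _ in contribution], 0.0, DELAY_BUDGET)
    return best["value"], best["plan"]
```

For the toy numbers above, the search allocates samples toward the modalities with the highest contribution per unit of delay while respecting both the per-modality cap and the round deadline; the bounding step is what keeps the enumeration far smaller than brute force as the vehicle and modality counts grow.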