Pub Date: 2025-09-19 | DOI: 10.1109/TITS.2025.3609790
Yulun Li;Hongfang Gong;Mei Kuang
Deep reinforcement learning (DRL) is a promising way to develop autonomous driving decision-making models. However, poor driving decisions and the low sample efficiency of jointly training multiple DRL algorithms hinder its application to driving decision-making. This article proposes an innovative framework that combines two different DRL algorithms as upper- and lower-layer planners to make car-following and lane-changing decisions, respectively. The upper- and lower-layer models are trained simultaneously, and the double-layer model outputs a composite driving action. The upper-layer model uses the TD3 algorithm to generate continuous vehicle speeds. This article proposes an action exploration mechanism in which, during the early training phase, the TD3 algorithm selects one of two action policies with a given probability and then outputs an action value. Moreover, the proposed Q-value auxiliary networks guide the upper-layer algorithm to compute its Q-value based on a Q-value from the trained TD3 algorithm. In the lower-layer model, a dueling double DQN addresses how the vehicle changes lanes, outputting discrete values that instruct the vehicle to change lanes. To validate the model in autonomous driving applications, different training and testing scenarios simulating expressways are designed in SUMO. The experiments show that our methods address the difficulty of coupled training for the integrated model and improve its performance under different traffic flow scenarios. Compared with other models, our model increases driving velocity while ensuring vehicle safety.
{"title":"An Efficient Synchronous Training Integrated Model for Driving Decision-Making Based on Deep Reinforcement Learning","authors":"Yulun Li;Hongfang Gong;Mei Kuang","doi":"10.1109/TITS.2025.3609790","DOIUrl":"https://doi.org/10.1109/TITS.2025.3609790","url":null,"abstract":"Deep reinforcement learning (DRL) is a promising way to develop autonomous driving decision-making models. However, poor driving decisions and low sample efficiency for multiple DRL coupled training hinder its applications in driving decision-making models. This article proposes an innovative framework to combine two different DRL algorithms as the upper- and lower-layer planner to make car-following and lane-changing decisions respectively. The upper- and lower-layer models are trained simultaneously, and the double-layer model outputs a composite driving action. In the upper-layer model, using TD3 algorithm generates continuous vehicle speed. This article proposes the action exploration mechanism where the TD3 algorithm selects one of two action policies with a probability and then outputs an action value in the early training phase. Moreover, the proposed Q-value auxiliary networks guide our upper-layer algorithm to compute the Q-value based on a Q-value from the trained TD3 algorithm. Dueling-double DQN is used to address the issue of how a vehicle changes lanes in the lower-layer model and to output discrete values to instruct the vehicle to change lanes. To validate our model in autonomous driving applications, different training and testing scenarios simulating expressways are designed through SUMO. The experiments show that our methods address the difficulty of coupling training for the integrated model and improve its performance under different traffic flow scenarios. Compared with other models, our model enhances driving velocity while ensuring vehicle safety.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 12","pages":"23269-23281"},"PeriodicalIF":8.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145665757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-18 | DOI: 10.1109/TITS.2025.3601630
Weilin Ruan;Wenzhuo Wang;Siru Zhong;Wei Chen;Li Liu;Yuxuan Liang
Predicting spatio-temporal traffic flow presents significant challenges due to complex interactions between spatial and temporal factors. Existing approaches often address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce the Spatio-Temporal Unitized Model (STUM), a unified framework designed to capture both spatial and temporal dependencies while addressing spatio-temporal heterogeneity through techniques such as distribution alignment and feature fusion. It also ensures both predictive accuracy and computational efficiency. Central to STUM is the Adaptive Spatio-temporal Unitized Cell (ASTUC), which utilizes low-rank matrices to seamlessly store, update, and interact with space, time, as well as their correlations. Our framework is also modular, allowing it to integrate with various spatio-temporal graph neural networks through components such as backbone models, feature extractors, residual fusion blocks, and the predictor to collectively enhance forecasting outcomes. Experimental results across multiple real-world datasets demonstrate that STUM consistently improves prediction performance with minimal computational cost. These findings are further supported by hyperparameter optimization, ablation studies, and result visualization. We provide our source code for reproducibility at https://github.com/RWLinno/STUM
Title: Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting (IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 11, pp. 21296-21308)
Pub Date: 2025-09-11 | DOI: 10.1109/TITS.2025.3605465
Aiheng Zhang;Zhen Sun;Qiguang Jiang;Kai Wang;Ming Li;Bailing Wang
With the development of intelligent transportation systems, vehicles are exposed to a complex network environment. As the mainstream in-vehicle network (IVN), the controller area network (CAN) has many potential security hazards. Existing deep learning-based intrusion detection methods have advantages in security performance; however, they consume too many resources and are therefore not suitable for direct deployment in the IVN. In this paper, we explore computational resource allocation schemes in IVNs and propose LiPar, a parallel neural network structure that uses lightweight multi-dimensional spatial and temporal feature fusion learning to perform intrusion detection in the resource-constrained in-vehicle environment. In particular, LiPar adaptively allocates task loads to in-vehicle computing devices, such as multiple electronic control units, domain controllers, and computing gateways, by evaluating whether a computing device is suitable to undertake a branch computing task according to its real-time resource occupancy. Experimental results show that LiPar achieves better detection performance, running efficiency, and a more lightweight model size than existing methods, adapts well to the resource-constrained in-vehicle environment, and can practically protect the security of the in-vehicle CAN bus. Code is available at https://github.com/wangkai-tech23/LiPar
{"title":"LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection","authors":"Aiheng Zhang;Zhen Sun;Qiguang Jiang;Kai Wang;Ming Li;Bailing Wang","doi":"10.1109/TITS.2025.3605465","DOIUrl":"https://doi.org/10.1109/TITS.2025.3605465","url":null,"abstract":"With the development of intelligent transportation systems, vehicles are exposed to a complex network environment. As the mainstream in-vehicle network (IVN), the controller area network (CAN) has many potential security hazards. Existing deep learning-based intrusion detection methods have security performance advantages, however, they consume too much resources and are therefore not suitable to be directly implemented into the IVN. In this paper, we explore computational resource allocation schemes in the IVNs and propose the LiPar, which is a parallel neural network structure using lightweight multi-dimensional spatial and temporal feature fusion learning to perform intrusion detection tasks in the resource-constrained in-vehicle environment. In particular, LiPar adaptively allocates task loads to in-vehicle computing devices, such as multiple electronic control units, domain controllers, and computing gateways by evaluating whether a computing device is suitable to undertake the branch computing tasks according to its real-time resource occupancy. Experiment results show that LiPar achieves better detection performance, running efficiency, and optimized lightweight model size over existing methods, and can be well adapted to the resource-constrained in-vehicle environment and practically protect the in-vehicle CAN bus security. Code is available at <uri>https://github.com/wangkai-tech23/LiPar</uri>","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 12","pages":"23358-23373"},"PeriodicalIF":8.4,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145665742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-09 | DOI: 10.1109/TITS.2025.3604412
Xiaokai Liu;Luyuan Hao;Yangyang Wang;Jie Wang
Ensuring resilient semantic segmentation under diverse outdoor conditions is vital for autonomous driving. However, nighttime segmentation remains underdeveloped compared to its daytime counterpart due to poor illumination and scarce annotated data, posing a significant challenge for night-scene understanding. Most current approaches mainly rely on domain adaptation technologies to transfer segmentation models trained on daytime scenes. However, the substantial distinctions between daytime and nighttime domains often hinder effective adaptation. To address this challenge, we leverage the implicit comprehensive information within nighttime data to enhance semantic segmentation through a semi-supervised approach. Specifically, we introduce a Semi-Supervised Comprehensive Learning (SSCL) approach, which is a unified, closed-loop learning architecture composed of three mutually reinforcing correction mechanisms: (1) unsupervised interactive correction to mitigate the risk of erroneous label propagation by leveraging complementary learning abilities; (2) unsupervised reinforcement correction, which enhances the model’s adaptability by promoting diverse learning on high-uncertainty regions through entropy-guided perturbation; (3) supervised standard correction to ensure alignment with known standards by anchoring the system to reference answers. SSCL is the first semi-supervised algorithm that jointly exploits structural diversity, uncertainty-aware supervision, and closed-loop correction to fully harness the latent potential of unlabeled nighttime data. Extensive experiments on NightCity, Dark Zurich and Nighttime Driving datasets demonstrate that SSCL achieves state-of-the-art performance in nighttime semantic segmentation.
{"title":"SSCL: Semi-Supervised Comprehensive Learning for Nighttime Semantic Segmentation","authors":"Xiaokai Liu;Luyuan Hao;Yangyang Wang;Jie Wang","doi":"10.1109/TITS.2025.3604412","DOIUrl":"https://doi.org/10.1109/TITS.2025.3604412","url":null,"abstract":"Ensuring resilient semantic segmentation under diverse outdoor conditions is vital for autonomous driving. However, nighttime segmentation remains underdeveloped compared to its daytime counterpart due to poor illumination and scarce annotated data, posing a significant challenge for night-scene understanding. Most current approaches mainly rely on domain adaptation technologies to transfer segmentation models trained on daytime scenes. However, the substantial distinctions between daytime and nighttime domains often hinder effective adaptation. To address this challenge, we leverage the implicit comprehensive information within nighttime data to enhance semantic segmentation through a semi-supervised approach. Specifically, we introduce a Semi-Supervised Comprehensive Learning (SSCL) approach, which is a unified, closed-loop learning architecture composed of three mutually reinforcing correction mechanisms: (1) unsupervised interactive correction to mitigate the risk of erroneous label propagation by leveraging complementary learning abilities; (2) unsupervised reinforcement correction, which enhances the model’s adaptability by promoting diverse learning on high-uncertainty regions through entropy-guided perturbation; (3) supervised standard correction to ensure alignment with known standards by anchoring the system to reference answers. SSCL is the first semi-supervised algorithm that jointly exploits structural diversity, uncertainty-aware supervision, and closed-loop correction to fully harness the latent potential of unlabeled nighttime data. Extensive experiments on NightCity, Dark Zurich and Nighttime Driving datasets demonstrate that SSCL achieves state-of-the-art performance in nighttime semantic segmentation.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 12","pages":"23202-23214"},"PeriodicalIF":8.4,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145665743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-04 | DOI: 10.1109/TITS.2025.3601620
Xunjia Zheng;Yue Liu;Huilan Li;Xing Chen;Jianjie Gao
Quantifying driving risks is the primary prerequisite for improving the driving safety of intelligent vehicles. However, the time-varying nature of driving risks, brought on by the dynamic and complicated traffic environment, makes them challenging to assess precisely. This study presents a method for measuring driving risk and establishes a unified framework for driving risk modeling. To determine the source of driving risks, we start from traffic accidents, which are abnormal energy transfers. A unified quantitative method based on an equivalent force model is proposed and analyzed in detail to design the unified driving risk modeling framework, yielding an integrated driving risk model that comprehensively considers driver, vehicle, and road factors. We first validate the effectiveness of the risk quantification through simulation experiments involving multi-vehicle interactions and further verify the feasibility of the model through three naturalistic vehicle experiments: car-following, cut-in, and intersection conflict scenarios. The experimental findings demonstrate that the proposed approach can determine the magnitude and direction of driving risks with vehicle-to-vehicle technical support. The method is integrated into a collision warning system that forecasts driving risk, broadcasts the location of the risk in advance, and provides the driver with control suggestions to ensure driving safety.
Title: An Improved Modeling Method for Driving Risk by Considering Driver–Vehicle–Road Factors (IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 11, pp. 21162-21171)
Pub Date: 2025-09-03 | DOI: 10.1109/TITS.2025.3601234
Tao Cheng;Mustafa Can Ozkan;Meng Fang;Xianghui Zhang
Structural disruptions in road networks, such as bridge closures or road outages, can severely impact traffic flow, leading to significant connectivity losses and unpredictable shifts in traffic patterns. Traditional traffic prediction models, designed for stable network conditions, often fail to adapt to these sudden changes in road capacity and connectivity. To address this challenge, we formalize flow redistribution caused by structural changes as a dynamic network prediction task. We then propose a novel feature-aware subgraph augmentation framework that enables Spatio-Temporal Graph Neural Networks (STGNNs) to learn robust redistribution patterns—even with limited historical data. Our framework simulates disruptions via subgraph perturbations to generate realistic training samples, effectively enriching the dataset and enhancing model generalizability to structural changes. Evaluated on the Hammersmith Bridge closure in London, the proposed augmentation strategy significantly improves model performance and outperforms data-hungry baselines, accurately capturing the disruption and its network-wide effects. This study demonstrates that targeted data augmentation can make STGNNs more effective in disruption scenarios with scarce historical data—offering a new, data-efficient paradigm for daily traffic prediction under both planned and unplanned network changes.
{"title":"What If London Bridge Is Closed? Feature-Aware Subgraph Augmentation for Modeling Road Network Structure Changes","authors":"Tao Cheng;Mustafa Can Ozkan;Meng Fang;Xianghui Zhang","doi":"10.1109/TITS.2025.3601234","DOIUrl":"https://doi.org/10.1109/TITS.2025.3601234","url":null,"abstract":"Structural disruptions in road networks, such as bridge closures or road outages, can severely impact traffic flow, leading to significant connectivity losses and unpredictable shifts in traffic patterns. Traditional traffic prediction models, designed for stable network conditions, often fail to adapt to these sudden changes in road capacity and connectivity. To address this challenge, we formalize flow redistribution caused by structural changes as a dynamic network prediction task. We then propose a novel feature-aware subgraph augmentation framework that enables Spatio-Temporal Graph Neural Networks (STGNNs) to learn robust redistribution patterns—even with limited historical data. Our framework simulates disruptions via subgraph perturbations to generate realistic training samples, effectively enriching the dataset and enhancing model generalizability to structural changes. Evaluated on the Hammersmith Bridge closure in London, the proposed augmentation strategy significantly improves model performance and outperforms data-hungry baselines, accurately capturing the disruption and its network-wide effects. This study demonstrates that targeted data augmentation can make STGNNs more effective in disruption scenarios with scarce historical data—offering a new, data-efficient paradigm for daily traffic prediction under both planned and unplanned network changes.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 11","pages":"21135-21148"},"PeriodicalIF":8.4,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145486515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-01 | DOI: 10.1109/TITS.2025.3601716
Lei Lei;Peng Wei;Xiaoyue Xu;Jianxing Zhang;Guiyong Zhang
Marine intelligent transportation systems (M-ITS) face challenges in achieving adaptability and efficiency under dynamic spatiotemporal (ST) environments. This paper proposes a novel multimodal underwater transformable agent (MUTA) designed as an adaptive and efficient information node for M-ITS. The MUTA integrates a biomimetic morphing wing, a multi-drive propulsion system, incremental environmental perception, and an uncertainty-aware dynamic modeling framework to switch smoothly among long-range (LR) cruising, high-maneuverability (HM) operation, and synergy modes. Comprehensive lake and sea trials demonstrate that the MUTA maintains high motion precision, reaches depths of up to 1200 m, covers a range of over 3000 km, and achieves an efficiency gain of over 17% in lake and marine environments. This work provides an adaptive and efficient solution for M-ITS under uncertain marine conditions.
{"title":"Multimodal Underwater Transformable Agent for Efficient Marine Transportation in Dynamic Spatiotemporal Environments","authors":"Lei Lei;Peng Wei;Xiaoyue Xu;Jianxing Zhang;Guiyong Zhang","doi":"10.1109/TITS.2025.3601716","DOIUrl":"https://doi.org/10.1109/TITS.2025.3601716","url":null,"abstract":"Marine intelligent transportation systems (M-ITS) face challenges in achieving adaptability and efficiency under dynamic spatiotemporal (ST) environments. This paper proposes a novel multimodal underwater transformable agent (MUTA) designed as an adaptive and efficient information node for M-ITS. The MUTA integrates a biomimetic morphing wing, multi-drive propulsion system, incremental environmental perception, and an uncertainty-aware dynamic modeling framework to switch smoothly among long-range (LR) cruising, high-maneuverability (HM) operation, and synergy modes. Comprehensive lake and sea trials demonstrate that the MUTA maintains high motion precision and achieves significant energy efficiency improvements, reaching depths up to 1200 m and covering a range of over 3000 km. The MUTA has achieved over 17% efficiency gain in lake and marine environments. This work provides an adaptive and efficient solution for M-ITS under uncertain marine conditions.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 11","pages":"21283-21295"},"PeriodicalIF":8.4,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145486510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-19 | DOI: 10.1109/TITS.2025.3594563
Ruiping Wang;Jun Cheng;Junzhi Yu
Pedestrian trajectory prediction is crucial for intelligent surveillance, social robot navigation, and autonomous driving systems, attracting substantial research attention in recent years. Despite significant advances, accurate trajectory prediction remains challenging due to the inherent uncertainty in pedestrian intentions and the multimodal nature of human movement patterns. There remain two limitations in existing methods. First, they focus solely on predicting final goals while overlooking crucial intermediate intentions that guide pedestrian movement. Second, they utilize a static latent distribution model across all future timesteps, which fails to capture the dynamic and evolving nature of trajectory uncertainties as pedestrians move. To address these challenges, we propose a novel timewise intentions and time-varying distribution network, TITDNet, which can estimate pedestrian intentions over time while dynamically modeling trajectory uncertainties at each future timestep. Specifically, TITDNet includes two key components: an intention generator that estimates dynamic pedestrian intentions, and a variational autoencoder that captures the time-varying multimodal nature of future trajectories. A trajectory decoder then integrates historical movement patterns, predicted intentions, and learned distributions to generate accurate future trajectories. Extensive experiments on ETH, UCY, and SDD benchmark datasets demonstrate that our approach significantly outperforms the state-of-the-art methods.
{"title":"Timewise Intentions and Time-Varying Distribution Network for Pedestrian Trajectory Prediction","authors":"Ruiping Wang;Jun Cheng;Junzhi Yu","doi":"10.1109/TITS.2025.3594563","DOIUrl":"https://doi.org/10.1109/TITS.2025.3594563","url":null,"abstract":"Pedestrian trajectory prediction is crucial for intelligent surveillance, social robot navigation, and autonomous driving systems, attracting substantial research attention in recent years. Despite significant advances, accurate trajectory prediction remains challenging due to the inherent uncertainty in pedestrian intentions and the multimodal nature of human movement patterns. There remain two limitations in existing methods. First, they focus solely on predicting final goals while overlooking crucial intermediate intentions that guide pedestrian movement. Second, they utilize a static latent distribution model across all future timesteps, which fails to capture the dynamic and evolving nature of trajectory uncertainties as pedestrians move. To address these challenges, we propose a novel timewise intentions and time-varying distribution network, TITDNet, which can estimate pedestrian intentions over time while dynamically modeling trajectory uncertainties at each future timestep. Specifically, TITDNet includes two key components: an intention generator that estimates dynamic pedestrian intentions, and a variational autoencoder that captures the time-varying multimodal nature of future trajectories. A trajectory decoder then integrates historical movement patterns, predicted intentions, and learned distributions to generate accurate future trajectories. Extensive experiments on ETH, UCY, and SDD benchmark datasets demonstrate that our approach significantly outperforms the state-of-the-art methods.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 11","pages":"21123-21134"},"PeriodicalIF":8.4,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-13 | DOI: 10.1109/TITS.2025.3594889
Yang Gao;Saeed Saadatnejad;Alexandre Alahi
Accurate human trajectory prediction is one of the most crucial tasks for autonomous driving, ensuring its safety. Yet, existing models often fail to fully leverage the visual cues that humans subconsciously communicate when navigating the space. In this work, we study the benefits of predicting human trajectories using human body poses instead of solely their Cartesian space locations in time. We propose ‘Social-pose’, an attention-based pose encoder that effectively captures the poses of all humans in a scene and their social relations. Our method can be integrated into various trajectory prediction architectures. We have conducted extensive experiments on state-of-the-art models (based on LSTM, GAN, MLP, and Transformer), and showed improvements over all of them on synthetic (Joint Track Auto) and real (Human3.6M, Pedestrians and Cyclists in Road Traffic, and JRDB) datasets. We also explored the advantages of using 2D versus 3D poses, as well as the effect of noisy poses and the application of our pose-based predictor in robot navigation scenarios.
{"title":"Social-Pose: Enhancing Trajectory Prediction With Human Body Pose","authors":"Yang Gao;Saeed Saadatnejad;Alexandre Alahi","doi":"10.1109/TITS.2025.3594889","DOIUrl":"https://doi.org/10.1109/TITS.2025.3594889","url":null,"abstract":"Accurate human trajectory prediction is one of the most crucial tasks for autonomous driving, ensuring its safety. Yet, existing models often fail to fully leverage the visual cues that humans subconsciously communicate when navigating the space. In this work, we study the benefits of predicting human trajectories using human body poses instead of solely their Cartesian space locations in time. We propose ‘Social-pose’, an attention-based pose encoder that effectively captures the poses of all humans in a scene and their social relations. Our method can be integrated into various trajectory prediction architectures. We have conducted extensive experiments on state-of-the-art models (based on LSTM, GAN, MLP, and Transformer), and showed improvements over all of them on synthetic (Joint Track Auto) and real (Human3.6M, Pedestrians and Cyclists in Road Traffic, and JRDB) datasets. We also explored the advantages of using 2D versus 3D poses, as well as the effect of noisy poses and the application of our pose-based predictor in robot navigation scenarios.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 11","pages":"21309-21319"},"PeriodicalIF":8.4,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145486521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-13 | DOI: 10.1109/TITS.2025.3594901
Hao Ren;Tao Zheng;Lei Zhang;Wenxian Wang;Meng Li;Hongwei Li
Vulnerability behavior scanning plays a crucial role in securing Intelligent Autonomous Transportation Systems by ensuring protected communications and maintaining data integrity. Current scanning solutions, however, demonstrate several critical shortcomings: (1) their dependence on static analysis methods with predetermined scanning locations prevents dynamic adjustment of scanning strategies; (2) their limited capacity to capture data across multiple system layers fails to address sophisticated multi-layered attack patterns; and (3) their inability to dynamically activate monitoring probes hinders timely responses to newly emerging threats. To resolve these limitations, we present VBSF, an efficient and non-intrusive vulnerability scanning framework built upon extended Berkeley Packet Filter technology. The proposed system incorporates two key innovations: a dynamic probe activation mechanism that intelligently adjusts scanning locations in real-time to optimize resource usage, and a standardized data format that enables integrated analysis of vulnerability behaviors across different system layers. Experimental evaluations confirm that VBSF effectively identifies critical vulnerability behaviors in diverse attack scenarios while introducing only 1.47% additional system overhead.
{"title":"VBSF: Vulnerability Behavior Scanning Framework for Intelligent Autonomous Transport Systems","authors":"Hao Ren;Tao Zheng;Lei Zhang;Wenxian Wang;Meng Li;Hongwei Li","doi":"10.1109/TITS.2025.3594901","DOIUrl":"https://doi.org/10.1109/TITS.2025.3594901","url":null,"abstract":"Vulnerability behavior scanning plays a crucial role in securing Intelligent Autonomous Transportation Systems by ensuring protected communications and maintaining data integrity. Current scanning solutions, however, demonstrate several critical shortcomings: (1) their dependence on static analysis methods with predetermined scanning locations prevents dynamic adjustment of scanning strategies; (2) their limited capacity to capture data across multiple system layers fails to address sophisticated multi-layered attack patterns; and (3) their inability to dynamically activate monitoring probes hinders timely responses to newly emerging threats. To resolve these limitations, we present <inline-formula> <tex-math>$textsf {VBSF}$ </tex-math></inline-formula>, an efficient and non-intrusive vulnerability scanning framework built upon extended Berkeley Packet Filter technology. The proposed system incorporates two key innovations: a dynamic probe activation mechanism that intelligently adjusts scanning locations in real-time to optimize resource usage, and a standardized data format that enables integrated analysis of vulnerability behaviors across different system layers. Experimental evaluations confirm that <inline-formula> <tex-math>$textsf {VBSF}$ </tex-math></inline-formula> effectively identifies critical vulnerability behaviors in diverse attack scenarios while introducing only 1.47% additional system overhead.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 10","pages":"18225-18237"},"PeriodicalIF":8.4,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145384611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}