Pub Date: 2024-12-31; DOI: 10.1109/TITS.2024.3520613
Kehua Chen;Yuhao Luo;Meixin Zhu;Hai Yang
Lane changing is a dynamic scenario characterized by intricate interactions among vehicles. In mixed-autonomy traffic, modeling human-like lane-change trajectories enables human drivers to better understand and predict autonomous vehicles’ behaviors, thereby enhancing road safety and travel efficiency. In this study, we achieve human-like interactive lane-change modeling with a novel framework named Diff-LC. The behavior of the lane-changing vehicle (LCV) is modeled by a diffusive planner, and the trajectory to execute is selected using the LCV reward function recovered through Multi-Agent Adversarial Inverse Reinforcement Learning (MA-AIRL). To account for interactions between following vehicles (FVs) and LCVs, we further employ a diffusive predictor that forecasts the future behaviors of FVs conditioned on both historical and planned trajectories. Additionally, we leverage the recovered FV reward function to enable controllable trajectory prediction. In the experiments, we first analyze the significance of features in the recovered reward functions and compare the LCV with the FV. To validate the proposed framework, we compare the diffusive predictor and planner with several state-of-the-art methods. The results demonstrate that motions planned by Diff-LC reach the intended positions with small displacement errors and exhibit speed and jerk distributions highly similar to those of human drivers. We also conduct a dynamic simulation to evaluate Diff-LC’s performance across different traffic conditions. Finally, we explore customized generation using the Diffusion Posterior Sampling method. The code is available at https://github.com/zeonchen/Diff-LC/.
{"title":"Human-Like Interactive Lane-Change Modeling Based on Reward-Guided Diffusive Predictor and Planner","authors":"Kehua Chen;Yuhao Luo;Meixin Zhu;Hai Yang","doi":"10.1109/TITS.2024.3520613","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520613","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3903-3916"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143535579","RegionNum":1,"RegionCategory":"工程技术"}
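The Diff-LC abstract above evaluates planned motions by displacement error and by jerk distribution. The following is a minimal Python sketch of how such quantities are commonly computed. It is not taken from the Diff-LC repository; the function names, the choice of average and final displacement error, and the finite-difference jerk estimate are all assumptions made for illustration.

```python
import math


def displacement_errors(planned, actual):
    """Return (average, final) Euclidean error between two (x, y) trajectories.

    Hypothetical helper: assumes both trajectories are sampled at the same
    time steps and have the same, non-zero length.
    """
    if not planned or len(planned) != len(actual):
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [
        math.hypot(px - ax, py - ay)
        for (px, py), (ax, ay) in zip(planned, actual)
    ]
    return sum(dists) / len(dists), dists[-1]


def jerk_from_speeds(speeds, dt):
    """Estimate longitudinal jerk as the second finite difference of speed.

    Acceleration is the first difference of speed divided by dt; jerk is the
    first difference of acceleration divided by dt.
    """
    accel = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    return [(b - a) / dt for a, b in zip(accel, accel[1:])]
```

Comparing the jerk values of planned and human trajectories, for example with histograms, is one way to check the "similar jerk distribution" property the abstract reports.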
Pub Date: 2024-12-31; DOI: 10.1109/TITS.2024.3504605
Kunyoung Lee;Hyunsoo Seo;Seunghyun Kim;Byeong Seon An;Shinwi Park;Yonggwon Jeon;Eui Chul Lee
Remote photoplethysmography (rPPG) is a method for monitoring the pulse signal by using a camera sensor to capture facial video that contains variations in blood flow beneath the skin. Recent rPPG advancements have enabled the measurement of an individual’s heart rate with a Root Mean Square Error (RMSE) of approximately 1.0 in controlled indoor environments. However, when applied to in-car datasets that include driving environments, the RMSE of rPPG measurements increases significantly, to over 9.07. This limitation, caused by motion-related artifacts and fluctuations in ambient illumination, becomes particularly noticeable while driving, resulting in a Percentage of Time that Error is less than 6 beats per minute (PTE6) of up to 65.1%. To address these limitations, we focus on the assessment of rPPG noise, with an emphasis on evaluating noise components within facial video and quantifying the quality of the rPPG measurement. In this paper, we propose a deep learning framework, based on a video vision transformer, that infers both the rPPG signal and its quality. The proposed method demonstrates that the top 10% quality measurements yield PTE6 of 91.98% and 99.59% in driving and garage environments, respectively. Additionally, we introduce a quality-based rPPG compensation method that improves accuracy in driving environments by predicting rPPG quality based on noise assessment. This compensation method demonstrates superior accuracy compared to the current state of the art, achieving a PTE6 of 68.24% in driving scenarios.
{"title":"Quality-Based rPPG Compensation With Temporal Difference Transformer for Camera-Based Driver Monitoring","authors":"Kunyoung Lee;Hyunsoo Seo;Seunghyun Kim;Byeong Seon An;Shinwi Park;Yonggwon Jeon;Eui Chul Lee","doi":"10.1109/TITS.2024.3504605","DOIUrl":"https://doi.org/10.1109/TITS.2024.3504605","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"1951-1963"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143183983","RegionNum":1,"RegionCategory":"工程技术"}
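The rPPG abstract above reports RMSE and PTE6 (the percentage of time that the heart-rate error is below 6 beats per minute), including PTE6 on the top 10% of measurements ranked by predicted quality. The sketch below shows one plausible way to compute these figures from paired heart-rate estimates. The function names, the per-sample treatment of "time", and the ranking rule are assumptions for illustration and are not taken from the paper.

```python
import math


def rmse(pred, ref):
    """Root mean square error between predicted and reference heart rates (bpm)."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


def pte6(pred, ref, threshold=6.0):
    """Percentage of samples whose absolute error is below `threshold` bpm.

    Assumes samples are evenly spaced in time, so a share of samples equals
    a share of time.
    """
    within = sum(1 for p, r in zip(pred, ref) if abs(p - r) < threshold)
    return 100.0 * within / len(pred)


def keep_top_quality(pred, ref, quality, fraction=0.1):
    """Keep the `fraction` of samples with the highest quality score.

    Hypothetical helper: `quality` is any per-sample score where higher
    means a more trustworthy measurement. At least one sample is kept.
    """
    order = sorted(range(len(quality)), key=lambda i: quality[i], reverse=True)
    kept = order[: max(1, int(len(order) * fraction))]
    return [pred[i] for i in kept], [ref[i] for i in kept]
```

Evaluating `pte6` on the output of `keep_top_quality` mirrors the kind of quality-filtered evaluation the abstract describes, under the assumptions stated above.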
Pub Date: 2024-12-31; DOI: 10.1109/TITS.2024.3520520
Siyuan Yu;Congkai Shen;James Dallas;Bogdan I. Epureanu;Paramsothy Jayakumar;Tulga Ersal
This paper presents a novel terrain-adaptive local trajectory planner designed for the autonomous operation of off-road vehicles on deformable terrains. State-of-the-art solutions either do not account for deformable terrains or do not offer sufficient robustness or computational speed. To bridge this research gap, the paper introduces a novel model predictive control (MPC) formulation. In contrast to prevailing state-of-the-art approaches, which rely exclusively on either hard or soft constraints for obstacle avoidance, the present formulation enhances robustness by incorporating both types of constraints. The effectiveness and robustness of the formulation are evaluated through extensive simulations covering a wide range of randomized scenarios and are compared against state-of-the-art methods. Subsequently, the formulation is augmented with an optimal-control-oriented terramechanics model from the literature that explicitly addresses terrain deformation. Additionally, a terrain estimator based on the unscented Kalman filter adjusts the sinkage exponent online, resulting in a terrain-adaptive formulation. This formulation is tested on a physical vehicle in real-world experiments, with a rigid-terrain formulation as the benchmark. The results showcase the superior safety and performance of the proposed formulation, underscoring the importance of integrating terramechanics knowledge into the planning process. Specifically, the proposed terrain-adaptive formulation achieves a reduced mean absolute sideslip angle, a decreased mean absolute yaw rate, a shorter time to goal, and a higher success rate, which is attributed primarily to the terramechanics knowledge embedded in the planner.
{"title":"A Real-Time Terrain-Adaptive Local Trajectory Planner for High-Speed Autonomous Off-Road Navigation on Deformable Terrains","authors":"Siyuan Yu;Congkai Shen;James Dallas;Bogdan I. Epureanu;Paramsothy Jayakumar;Tulga Ersal","doi":"10.1109/TITS.2024.3520520","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520520","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3324-3340"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143535419","RegionNum":1,"RegionCategory":"工程技术"}
Pub Date: 2024-12-31; DOI: 10.1109/TITS.2024.3520328
Liang Cao;Yan Qin;Yingnan Pan;Hongjing Liang
This paper considers the prescribed performance-based optimal formation control problem for unmanned surface vehicles (USVs) with position constraints and time-varying partial constraints on the yaw angle, while avoiding collisions and maintaining connectivity. Specifically, prescribed-time performance constraints are imposed on the position tracking errors between each vehicle and its leader. A prescribed performance-based optimal formation control strategy is then developed to guarantee that each vehicle achieves collision-free formation control while maintaining connectivity, together with the prescribed transient and steady-state performance of the position tracking errors. Inspired by prescribed performance control, an improved asymmetric barrier function with prescribed performance is introduced to ensure that the yaw angle errors satisfy the prescribed performance constraints. Finally, theoretical analysis demonstrates that, under the optimal formation control scheme, the position tracking errors converge to a prescribed, arbitrarily small region within a prescribed time interval and the yaw angle adheres to the time-varying partial constraints, at optimal cost and subject to limited communication ranges and collision avoidance constraints. Simulation results and comprehensive comparisons demonstrate the effectiveness and superiority of the proposed scheme.
{"title":"Prescribed Performance-Based Optimal Formation Control for USVs With Position Constraints and Yaw Angle Time-Varying Partial Constraints","authors":"Liang Cao;Yan Qin;Yingnan Pan;Hongjing Liang","doi":"10.1109/TITS.2024.3520328","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520328","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"4109-4121"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143535572","RegionNum":1,"RegionCategory":"工程技术"}
Pub Date: 2024-12-31; DOI: 10.1109/TITS.2024.3514105
Yan Li;Zhenxing Niu;Yinzhang He;Qinshi Hu;Jiupeng Zhang
The goal of preventive maintenance (PM) decision-making for airport pavements is to deploy the appropriate maintenance countermeasures at the correct time. This paper proposed a three-stage, machine-learning-based method that further refined the PM decision-making process. First, a pavement maintenance level model was developed using an SVM optimized with PCA and the PSO algorithm. The model was then used to separate pavement maintenance into three categories: daily, PM, and major. Second, DBSCAN and OPTICS were used to further subdivide the PM requirements. Third, to support well-founded PM decisions, suitable maintenance procedures were chosen based on the predominant damage types of the pavement units. The results showed that, compared to the original SVM model, the classification accuracy of the PCA-PSO-SVM model was greatly improved, with the overall accuracy and the accuracy of each class increasing by 10%, 41.7%, 4.6%, and 7.8%, respectively. When clustering the airport pavement performance dataset, OPTICS outperformed DBSCAN. Four groups of PM demands were discovered by visualizing the best grouping levels after dimensionality reduction.
{"title":"A Three-Stage Decision-Making Method Based on Machine Learning for Preventive Maintenance of Airport Pavement","authors":"Yan Li;Zhenxing Niu;Yinzhang He;Qinshi Hu;Jiupeng Zhang","doi":"10.1109/TITS.2024.3514105","DOIUrl":"https://doi.org/10.1109/TITS.2024.3514105","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"4152-4164"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143535536","RegionNum":1,"RegionCategory":"工程技术"}
Pub Date: 2024-12-30; DOI: 10.1109/TITS.2024.3509381
Zhu Xiao;Bo Liu;Linshan Wu;Hongbo Jiang;Beihao Xia;Tao Li;Cassandra C. Wang
Carbon emissions from passenger cars in cities are largely responsible for severe climate change and serious environmental problems. Exploring carbon emissions from passenger cars helps to control urban pollution and achieve urban sustainability. However, predicting the spatio-temporal distribution of carbon emissions from passenger cars is challenging, because the following technical issues remain. i) Vehicle carbon emissions involve complex spatial interactions and temporal dynamics, and how to collaboratively integrate such spatial-temporal correlations for carbon emission prediction is not yet resolved. ii) Given the mobility of passenger cars, the hidden dependencies inherent in traffic density are not properly addressed when predicting their carbon emissions. To tackle these issues, we propose a Collaborative Spatial-temporal Network (CSTNet) that predicts carbon emissions from passenger car trajectory data. In the proposed method, we extract collaborative properties from a multi-view graph structure together with parallel inputs of carbon emission and traffic density. We then design a spatial-temporal convolutional block for both carbon emission and traffic density, which consists of a temporal gate convolution, a spatial convolution, and a temporal attention mechanism. Following that, an interaction layer between carbon emission and traffic density is proposed to handle their internal dependencies and to further model spatial relationships between the features. In addition, we identify several global factors and embed them for the final prediction through collaborative fusion. Experimental results on a real-world passenger car trajectory dataset demonstrate that the proposed method outperforms the baselines by roughly 7%-11%.
{"title":"Exploring Spatio-Temporal Carbon Emission Across Passenger Car Trajectory Data","authors":"Zhu Xiao;Bo Liu;Linshan Wu;Hongbo Jiang;Beihao Xia;Tao Li;Cassandra C. Wang","doi":"10.1109/TITS.2024.3509381","DOIUrl":"https://doi.org/10.1109/TITS.2024.3509381","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"1812-1825"},"PeriodicalIF":7.9,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143183991","RegionNum":1,"RegionCategory":"工程技术"}
Pub Date: 2024-12-27; DOI: 10.1109/TITS.2024.3520103
Hong Zhu;Qingyang Lu;Lei Xue;Pingping Zhang;Guanglin Yuan
Vision-language tracking is an emerging topic in intelligent transportation systems, particularly significant in autonomous driving and road surveillance. The task aims to combine the visual modality with an auxiliary linguistic modality to locate the target object in a video sequence. Currently, multi-modal data scarcity and burdensome modality fusion are two major factors limiting the development of vision-language tracking. To tackle these issues, we propose an efficient and effective one-stage vision-language tracking framework (CPIPTrack) that unifies feature extraction and multi-modal fusion through interactive prompt learning. Feature extraction is performed by the high-performance vision-language foundation model CLIP, so the tracker inherits the impressive generalization ability of the large-scale model. Modality fusion is simplified to a few lightweight prompts, leading to significant savings in computational resources. Specifically, we design three types of prompts to dynamically learn the layer-wise feature relationships between vision and language, facilitating rich context interactions while enabling adaptation of the pre-trained CLIP. In this manner, discriminative target-oriented visual features can be extracted under language and template guidance and used for subsequent reasoning. Because the extra heavy modality fusion is eliminated, the proposed CPIPTrack is highly efficient in both training and inference. CPIPTrack has been extensively evaluated on three benchmark datasets, and the experimental results demonstrate that it achieves a good performance-speed balance, with an AUC of 66.0% on LaSOT and a runtime of 51.7 FPS on an RTX 2080 Super.
{"title":"Vision-Language Tracking With CLIP and Interactive Prompt Learning","authors":"Hong Zhu;Qingyang Lu;Lei Xue;Pingping Zhang;Guanglin Yuan","doi":"10.1109/TITS.2024.3520103","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520103","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3659-3670"},"PeriodicalIF":7.9,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"143563909","RegionNum":1,"RegionCategory":"工程技术"}
As the adoption of electric vehicles continues to grow, the demand for extensive charging infrastructure in urban areas is rising with it. In response to the charging infrastructure shortage, private charging piles have emerged as crucial supplementary energy sources, especially in areas lacking public charging infrastructure. The sharing of private charging piles, however, introduces several challenges. Notably, the variable availability time and extremely limited usage space of private charging piles make scheduling complex for charging pile owners. Furthermore, fully peer-to-peer operation of private charging piles may lead to suboptimal solutions for fulfilling overall charging demand. To address these challenges, we explore the potential for cooperation among geographically proximate charging piles. We introduce a novel online booking mechanism paired with specialized scheduling algorithms designed for scenarios involving both multiple private charging piles and single private charging piles. Our objective is to maximize the revenue of charging pile owners under fully dynamic conditions on both the supply and demand sides. Through theoretical proofs, we show that our mechanism achieves advantageous competitive ratios for both scenarios when compared to the offline optimal solutions. Extensive experiments with real charging sessions consistently demonstrate that the proposed mechanism achieves the highest revenue, providing substantial evidence for its superior performance.
Title: V2PCP: Toward Online Booking Mechanism for Private Charging Piles. Authors: Xinyu Lu; Jiawei Sun; Jiong Lou; Yusheng Ji; Chentao Wu; Wei Zhao; Guangtao Xue; Yuan Luo; Fan Cheng; Jie Li. IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 2, pp. 2514-2529. Published 2024-12-27. DOI: 10.1109/TITS.2024.3516828 (https://doi.org/10.1109/TITS.2024.3516828)
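The competitive-ratio guarantee mentioned in the V2PCP abstract can be stated in the usual way for an online revenue-maximization problem. The notation below is an assumption for illustration, not taken from the paper: \sigma is an arbitrary sequence of charging requests and pile availabilities, ALG(\sigma) is the revenue collected by the online mechanism, and OPT(\sigma) is the revenue of the offline optimal schedule that knows \sigma in advance.

```latex
% Assumed notation (not from the paper): sigma is a request sequence,
% ALG(sigma) the online revenue, OPT(sigma) the offline optimal revenue.
\[
  \mathrm{ALG}(\sigma) \;\ge\; \frac{1}{c}\,\mathrm{OPT}(\sigma)
  \quad \text{for all } \sigma, \qquad c \ge 1 .
\]
```

An online algorithm satisfying this bound is called c-competitive, and its competitive ratio is the smallest such c. A ratio closer to 1 means less revenue is lost by not knowing future supply and demand. The abstract does not report the actual ratios, so none are given here.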
Pub Date : 2024-12-27 DOI: 10.1109/TITS.2024.3519528
Li Li;Lun Tang;Yaqing Wang;Tong Liu;Qianbin Chen
A vehicle digital twin (VDT) can support multiple vehicle services through updating, but updating a VDT faces two fundamental issues. The first is isolation: each VDT should run its services without being affected by others. The second is timeliness: updates must be fast enough for low-latency services. In this paper, we first propose ensuring VDT isolation with the assistance of an intelligent reflecting surface (IRS) and network slicing (NS), and achieving better VDT update times within limited resources. Specifically, we partition resources among VDTs with different update requirements to ensure both isolation and the resources required for updates. In addition, considering that the communication performance between vehicles and base stations is affected by urban building density, we propose using an intelligent controller to control the physical channel by adjusting the phase shifts of the passive reflecting elements, ensuring better transmission performance during VDT updates. Second, considering the dynamic variability of vehicles and the environment, we propose an improved deep reinforcement learning algorithm based on the actor-critic framework to allocate communication and computing resources and to adjust the phase shifts of the IRS. Finally, extensive simulation results indicate that the proposed algorithm outperforms the benchmark algorithms.
Title: Intelligent Reflecting Surface and Network Slicing Assisted Vehicle Digital Twin Update. IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 3, pp. 3799-3813. Published 2024-12-27. DOI: 10.1109/TITS.2024.3519528 (https://doi.org/10.1109/TITS.2024.3519528)
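To make the idea of "adjusting the phase shift of passive reflective elements" concrete, the following is a minimal sketch for a single-antenna link, assuming perfectly known channels. It is not the paper's method (the paper uses an actor-critic deep reinforcement learning algorithm); it only shows the textbook closed-form alignment in which each reflected path is rotated to add coherently with the direct path. The function names and the channel values in the usage note are hypothetical.

```python
import cmath


def aligned_phase_shifts(h_direct, h_bs_irs, h_irs_user):
    """Phase shift per IRS element that aligns each reflected path
    with the direct path (assumes all channel coefficients are known)."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(g * h) for g, h in zip(h_bs_irs, h_irs_user)]


def effective_channel(h_direct, h_bs_irs, h_irs_user, phases):
    """Direct path plus the sum of reflected paths, each multiplied by
    the unit-modulus reflection coefficient exp(j * theta_n)."""
    reflected = sum(
        g * cmath.exp(1j * theta) * h
        for g, theta, h in zip(h_bs_irs, phases, h_irs_user)
    )
    return h_direct + reflected
```

For example, with hypothetical coefficients `h_d = 0.1 + 0.2j`, `g = [0.3 - 0.1j, -0.2 + 0.4j]`, and `h = [0.5j, 0.1 - 0.3j]`, calling `effective_channel(h_d, g, h, aligned_phase_shifts(h_d, g, h))` yields a channel whose magnitude equals the sum of the individual path magnitudes, which is the largest value any choice of phases can reach.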
Pub Date : 2024-12-27 DOI: 10.1109/TITS.2024.3516839
Kai Xiong;Hanqing Yu;Supeng Leng;Chongwen Huang;Chau Yuen
Urban Air Mobility (UAM), powered by flying cars, is poised to revolutionize urban transportation by expanding vehicle travel from the ground to the air. This advancement promises to alleviate congestion and enable faster commutes. However, the fast travel speeds mean vehicles will encounter vastly different environments during a single journey. As a result, onboard learning systems need access to extensive environmental data, leading to high costs in data collection and training. These demands conflict with the limited in-vehicle computing and battery resources. Fortunately, learning model sharing offers a solution. Well-trained local Deep Learning (DL) models can be shared with other vehicles, reducing the need for redundant data collection and training. However, this sharing process relies heavily on efficient vehicular communications in UAM. To address these challenges, this paper leverages the multi-hop Reconfigurable Intelligent Surface (RIS) technology to improve DL model sharing between distant flying cars. We also employ knowledge distillation to reduce the size of the shared DL models and enable efficient integration of non-identical models at the receiver. Our approach enhances model sharing and onboard learning performance for cars entering new environments. Simulation results show that our scheme improves the total reward by 85% compared to benchmark methods.
Title: Multi-Hop RIS-Aided Learning Model Sharing for Urban Air Mobility. IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 3, pp. 3947-3959. Published 2024-12-27. DOI: 10.1109/TITS.2024.3516839 (https://doi.org/10.1109/TITS.2024.3516839)
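The abstract above says knowledge distillation is used to shrink the shared DL models, but it does not give the loss. As a rough illustration only, the sketch below shows the commonly used soft-target distillation loss: the KL divergence between temperature-softened teacher and student output distributions, scaled by the squared temperature. The function names, the default temperature, and the choice of this particular loss are assumptions, not details from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, times T^2.

    Assumed soft-target formulation; the paper's exact loss is not stated.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(max(ps, 1e-12)))  # clamp avoids log(0)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge, so a smaller student model trained against it is pushed toward the teacher's behavior rather than only toward hard labels.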