IEEE Signal Processing Letters: Latest Articles

ARES: Text-Driven Automatic Realistic Simulator for Autonomous Traffic
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-15; DOI: 10.1109/LSP.2024.3481151
Jinghao Cao;Sheng Liu;Xiong Yang;Yang Li;Sidan Du
The large-scale generation of real-world scenario datasets is a pivotal task in autonomous driving. Existing methods focus solely on single-frame rendering and need complex inputs for continuous scenario rendering. This letter proposes ARES, a text-driven automatic realistic simulator that can generate extensive realistic datasets from a single text input. Its core idea is to generate vehicle trajectories from the textual description and then render the scenario using the vehicle attributes associated with these trajectories. To learn trajectory generation, supervisory signal temporal logic is proposed to assist a conditional diffusion model, which incorporates prior physical information. We annotate textual descriptions for the KITTI-MOT dataset and establish an objective quantitative evaluation system. The method achieves a matching score of 3.54 and an FID of 8.93 in the trajectory reconstruction task, along with a speed accuracy of 0.99 and a direction accuracy of 0.93 in the trajectory editing task. The rendered scenarios exhibit high quality and realism, which indicates great potential for testing autonomous driving algorithms with vehicle-in-the-loop simulations.
IEEE Signal Processing Letters, vol. 31, pp. 3049-3053.
Citations: 0
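The ARES abstract does not spell out how its signal temporal logic (STL) supervision is formulated, so the following is only a minimal sketch of the general idea. It assumes a single hypothetical rule ("speed always stays at or below a bound") and uses the standard quantitative semantics of the STL "always" operator. The function names, the 3 m/s bound, and the sample trajectory are illustrative assumptions, not taken from the letter.

```python
import math

def speeds(traj, dt):
    """Finite-difference speeds along a list of (x, y) points sampled every dt seconds."""
    return [math.hypot(x1 - x0, y1 - y0) / dt
            for (x0, y0), (x1, y1) in zip(traj, traj[1:])]

def always_leq_robustness(signal, bound):
    """Robustness of the STL formula G(signal <= bound).

    The value is the minimum over time of (bound - signal):
    positive if the rule holds everywhere, negative if it is violated.
    """
    return min(bound - s for s in signal)

# Hypothetical trajectory: two slow steps, then a jump of 5 m in one second.
traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 4.0)]
rho = always_leq_robustness(speeds(traj, dt=1.0), bound=3.0)
print(rho)
```

A negative robustness value means the trajectory breaks the rule at some time step. A term like this could in principle act as a physics-based penalty during training, though how ARES actually uses STL is not stated in the abstract.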
Multi-System Fusion Positioning Method Based on Factor Graph
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-15; DOI: 10.1109/LSP.2024.3480833
Hongmei Wang;Sheng Xing;Zhiwei Wang;Minghui Min;Shiyin Li
Ultra-wideband (UWB) positioning systems offer high-precision location capabilities but introduce positive biases in complex environments. The Pedestrian Dead Reckoning (PDR) algorithm, based on an Inertial Measurement Unit (IMU), can maintain robust tracking even when pedestrian trajectories change abruptly, but it suffers from cumulative errors. This study therefore combines the strengths of both systems: a factor graph model is established to enhance the multi-system fusion localization method. Experimental verification on both straight-line trajectories and scenarios involving abrupt state changes demonstrates an integrated average positioning accuracy within 0.1 m. Compared with traditional system fusion localization methods, the accuracy is improved by more than 50%.
IEEE Signal Processing Letters, vol. 31, pp. 3025-3029.
Citations: 0
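To make the fusion idea in the abstract above concrete, here is a toy sketch of a linear Gaussian factor graph in one dimension. It assumes unary factors that tie each position to a UWB fix and pairwise factors that tie consecutive positions to a PDR step length; the MAP estimate is then a least-squares problem with a tridiagonal system. The function name `fuse_uwb_pdr`, the noise levels, and the data are hypothetical, and the letter's actual model (2-D positions, bias handling) is not described in the abstract.

```python
def fuse_uwb_pdr(uwb, steps, sigma_uwb, sigma_pdr):
    """MAP positions for a 1-D chain with UWB unary factors and PDR step factors."""
    n = len(uwb)
    w_u = 1.0 / sigma_uwb ** 2   # information weight of each UWB fix
    w_p = 1.0 / sigma_pdr ** 2   # information weight of each PDR step
    diag = [w_u] * n             # main diagonal of the normal equations
    rhs = [w_u * z for z in uwb]
    off = [-w_p] * (n - 1)       # symmetric off-diagonal entries
    for i, d in enumerate(steps):  # factor: x[i+1] - x[i] = d
        diag[i] += w_p
        diag[i + 1] += w_p
        rhs[i] -= w_p * d
        rhs[i + 1] += w_p * d
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - off[i] * x[i + 1]) / diag[i]
    return x

# Hypothetical data: the third UWB fix carries a +0.9 m bias.
uwb = [0.0, 1.0, 2.9, 3.0]
steps = [1.0, 1.0, 1.0]
fused = fuse_uwb_pdr(uwb, steps, sigma_uwb=0.3, sigma_pdr=0.1)
print(fused)
```

Because the PDR step factors link neighbouring states, the biased fix is pulled back toward the value implied by the steps, while the UWB factors keep the chain from drifting. That is the complementary behaviour the abstract describes, here in its simplest linear form.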
Maximum Entropy and Quantized Metric Models for Absolute Category Ratings
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-15; DOI: 10.1109/LSP.2024.3480832
Dietmar Saupe;Krzysztof Rusek;David Hägele;Daniel Weiskopf;Lucjan Janowski
The datasets of most image quality assessment studies contain ratings on a categorical scale with five levels, from bad (1) to excellent (5). For each stimulus, the number of ratings from 1 to 5 is summarized and given in the form of the mean opinion score. In this study, we investigate families of multinomial probability distributions parameterized by mean and variance that are used to fit the empirical rating distributions. To this end, we consider quantized metric models based on continuous distributions that model perceived stimulus quality on a latent scale. The probabilities for the rating categories are determined by quantizing the corresponding random variables using threshold values. Furthermore, we introduce a novel discrete maximum entropy distribution for a given mean and variance. We compare the performance of these models and the state of the art given by the generalized score distribution for two large data sets, KonIQ-10k and VQEG HDTV. Given an input distribution of ratings, our fitted two-parameter models predict unseen ratings better than the empirical distribution. In contrast to empirical distributions of absolute category ratings and their discrete models, our continuous models can provide fine-grained estimates of quantiles of quality of experience that are relevant to service providers to satisfy a certain fraction of the user population.
IEEE Signal Processing Letters, vol. 31, pp. 2970-2974.
Citations: 0
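The quantized metric models mentioned in the abstract above can be illustrated with a short sketch. It assumes a normal latent quality variable and fixed thresholds at 1.5, 2.5, 3.5, and 4.5; both choices are assumptions made here for illustration, since the abstract only says that continuous latent distributions are quantized with threshold values. The function name `quantized_normal_probs` is hypothetical.

```python
from statistics import NormalDist

def quantized_normal_probs(mu, sigma, thresholds=(1.5, 2.5, 3.5, 4.5)):
    """Probabilities of ratings 1..5 obtained by cutting a normal latent variable."""
    latent = NormalDist(mu, sigma)
    # CDF values at the cut points, padded with 0 and 1 for the open-ended bins.
    cdf = [0.0] + [latent.cdf(t) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

probs = quantized_normal_probs(mu=3.0, sigma=1.0)
# Mean opinion score of the resulting five-level distribution.
mos = sum(k * p for k, p in enumerate(probs, start=1))
# A quantile on the latent scale, which the discrete ratings alone cannot resolve.
q10 = NormalDist(3.0, 1.0).inv_cdf(0.10)
print(probs, mos, q10)
```

The mean of the quantized distribution generally differs from the latent mean `mu`; the two coincide in this example only because the setup is symmetric around 3. The latent quantile `q10` is the kind of fine-grained estimate the abstract attributes to continuous models.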
Hierarchical Noise-Tolerant Meta-Learning With Noisy Labels
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3480033
Yahui Liu;Jian Wang;Yuntai Yang;Renlong Wang;Simiao Wang
Due to the detrimental impact of noisy labels on the generalization of deep neural networks, learning with noisy labels has become an important task in modern deep learning applications. Many previous efforts have mitigated this problem by either removing noisy samples or correcting labels. In this letter, we address this issue from a new perspective and empirically find that models trained with both clean and mislabeled samples exhibit distinguishable activation feature distributions. Building on this observation, we propose a novel meta-learning approach called the Hierarchical Noise-tolerant Meta-Learning (HNML) method, which involves a bi-level optimization comprising meta-training and meta-testing. In the meta-training stage, we incorporate consistency loss at the output prediction hierarchy to facilitate model adaptation to dynamically changing label noise. In the meta-testing stage, we extract activation feature distributions using class activation maps and propose a new mask-guided self-learning method to correct biases in the foreground regions. Through the bi-level optimization of HNML, we ensure that the model generates discriminative feature representations that are insensitive to noisy labels. When evaluated on both synthetic and real-world noisy datasets, our HNML method achieves significant improvements over previous state-of-the-art methods.
IEEE Signal Processing Letters, vol. 31, pp. 3020-3024.
Citations: 0
Pose-Promote: Progressive Visual Perception for Activities of Daily Living
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3480046
Qilang Ye;Zitong Yu
Poses are effective in interpreting fine-grained human activities, especially when encountering complex visual information. Unimodal methods for action recognition perform unsatisfactorily on daily activities because they lack a more comprehensive perspective. Multimodal methods that combine pose and visual information are still not exhaustive enough in mining complementary information. Therefore, we propose a Pose-promote (Ppromo) framework that utilizes a priori knowledge of pose joints to perceive visual information progressively. We first introduce a temporal promote module to activate each video segment using temporally synchronized joint weights. Then a spatial promote module is proposed to capture the key regions in visuals using the learned pose attention. To further refine the bimodal associations, a global inter-promote module is proposed to align global pose-visual semantics at the feature granularity. Finally, a learnable late fusion strategy between the visual and pose modalities is applied for accurate inference. Ppromo achieves state-of-the-art performance on three publicly available datasets.
IEEE Signal Processing Letters, vol. 31, pp. 2950-2954.
Citations: 0
Learning Multidimensional Spatial Attention for Robust Nighttime Visual Tracking
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3480831
Qi Gao;Mingfeng Yin;Yuanzhi Ni;Yuming Bo;Shaoyi Bei
The recent development of advanced trackers that use nighttime image enhancement technology has led to marked advances in the performance of visual tracking at night. However, the images recovered by currently available enhancement methods still have some weaknesses, such as blurred target details and obvious image noise. To this end, we propose a novel method for learning multidimensional spatial attention for robust nighttime visual tracking, named MSA-SCT, which builds on a spatial channel transformer-based low-light enhancer (SCT). First, a novel multidimensional spatial attention (MSA) is designed. Additional reliable feature responses are generated by aggregating channel and multi-scale spatial information, thus making the model more adaptable to illumination conditions and noise levels in different regions of the image. Second, optimized skip connections limit the effects of redundant information and noise, which helps fine detail features in nighttime images propagate from low-level to high-level features and improves the enhancement effect. Finally, the tracker with enhancers was tested on multiple tracking benchmarks to demonstrate the effectiveness and superiority of MSA-SCT.
IEEE Signal Processing Letters, vol. 31, pp. 2910-2914.
Citations: 0
A Recurrent Spatio-Temporal Graph Neural Network Based on Latent Time Graph for Multi-Channel Time Series Forecasting
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3479917
Linzhi Li;Xiaofeng Zhou;Guoliang Hu;Shuai Li;Dongni Jia
With the advancement of technology, the field of multi-channel time series forecasting has emerged as a focal point of research. In this context, spatio-temporal graph neural networks have attracted significant interest due to their outstanding performance. An established approach involves integrating graph convolutional networks into recurrent neural networks. However, this approach faces difficulties in capturing dynamic spatial correlations and discerning the correlation of multi-channel time series signals. Another major problem is that the discrete time interval of recurrent neural networks limits the accuracy of spatio-temporal prediction. To address these challenges, we propose a continuous spatio-temporal framework, termed Recurrent Spatio-Temporal Graph Neural Network based on Latent Time Graph (RST-LTG). RST-LTG incorporates adaptive graph convolution networks with a time embedding generator to construct a latent time graph, which subtly captures evolving spatial characteristics by aggregating spatial information across multiple time steps. Additionally, to improve the accuracy of continuous time modeling, we introduce a gate enhanced neural ordinary differential equation that effectively integrates information across multiple scales. Empirical results on four publicly available datasets demonstrate that the RST-LTG model outperforms 19 competing methods in terms of accuracy.
IEEE Signal Processing Letters, vol. 31, pp. 2875-2879.
Citations: 0
Towards Hybrid Quantum-Classical Deep Learning Architecture for Indoor-Outdoor Detection Using QCNN-LSTM and Cluster State Signal Processing
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3480043
Muhammad Bilal Akram Dastagir;Dongsoo Han
Quantum computing, combined with deep learning, leverages principles like superposition and entanglement to enhance complex data-driven tasks. The Noisy Intermediate-Scale Quantum (NISQ) era presents opportunities for hybrid quantum-classical architectures to address such tasks. Despite significant progress, practical applications of these hybrid models are limited. This letter proposes a novel hybrid quantum-classical deep learning architecture that integrates Quantum Convolutional Neural Networks (QCNNs) and Long Short-Term Memory (LSTM) networks, enhanced by Cluster State Signal Processing. Furthermore, this letter addresses indoor-outdoor detection using high-dimensional signal data, utilizing the Cirq platform, a Python framework for developing and simulating NISQ circuits on quantum computers and simulators. The approach addresses noise and decoherence issues. Preliminary results show that the QCNN-LSTM model outperforms pure quantum and hybrid models in accuracy and efficiency. This validates the practical benefits of hybrid architectures, paving the way for advancements in complex data classification like indoor-outdoor detection.
IEEE Signal Processing Letters, vol. 31, pp. 2945-2949.
Citations: 0
Fishing Net Optimization: A Learning Scheme of Optimizing Multi-Lateration Stations in Air-Ground Vehicle Networks
IF 3.2; Zone 2 (Engineering and Technology); Q2 (Engineering, Electrical & Electronic); Pub Date: 2024-10-14; DOI: 10.1109/LSP.2024.3479923
Haitao Zhao;Chunxi Zhao;Tianyu Zhang;Bo Xu;Jinlong Sun
Integrated sensing and communication in 6G, particularly for air-ground surveillance using automatic dependent surveillance-broadcast (ADS-B) and multi-lateration (MLAT) systems, is gaining significant research interest. This letter investigates the problem of optimal anchor station selection for tracking aerial vehicles and proposes a novel heuristic learning scheme termed fishing net-like optimization (FNO). Specifically, we perform constrained random walk steps on a two-dimensional surface to optimize the initial anchor stations' parameters. FNO also incorporates new evaluation strategies and acceleration techniques to speed up convergence. Experimental results demonstrate that FNO achieves a better selection of anchor stations, and that anchor optimization can improve the accuracy of the chosen MLAT system by a factor of ten or more.
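The constrained random walk mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' FNO algorithm: the cost function (a range-based horizontal dilution of precision, `hdop`), the greedy acceptance rule, the square bounds, and all names and parameters below are assumptions chosen only to show the general idea of perturbing anchor positions on a 2-D surface while keeping them inside a feasible region.

```python
import math
import random

def hdop(anchors, target):
    """Horizontal dilution of precision for range-based 2-D positioning.

    Assumed stand-in cost; a real MLAT system would typically use a
    time-difference-of-arrival model instead.
    """
    sxx = sxy = syy = 0.0
    for ax, ay in anchors:
        dx, dy = target[0] - ax, target[1] - ay
        r = math.hypot(dx, dy)
        if r < 1e-9:
            return math.inf  # anchor sits on the target: direction undefined
        ux, uy = dx / r, dy / r
        sxx += ux * ux
        sxy += ux * uy
        syy += uy * uy
    det = sxx * syy - sxy * sxy
    if det < 1e-12:
        return math.inf  # degenerate geometry (all anchors collinear with target)
    # trace of the inverse of the 2x2 matrix [[sxx, sxy], [sxy, syy]]
    return math.sqrt((sxx + syy) / det)

def mean_hdop(anchors, targets):
    """Average HDOP over a set of target positions."""
    return sum(hdop(anchors, t) for t in targets) / len(targets)

def random_walk_optimize(anchors, targets, bounds, steps=2000, sigma=0.5, seed=0):
    """Greedy random walk: move one anchor per step, clamp it to the
    square [lo, hi] x [lo, hi], and keep the move only if the cost drops."""
    rng = random.Random(seed)
    lo, hi = bounds
    best = [tuple(a) for a in anchors]
    best_cost = mean_hdop(best, targets)
    for _ in range(steps):
        i = rng.randrange(len(best))
        x = min(hi, max(lo, best[i][0] + rng.gauss(0.0, sigma)))
        y = min(hi, max(lo, best[i][1] + rng.gauss(0.0, sigma)))
        trial = best[:i] + [(x, y)] + best[i + 1:]
        cost = mean_hdop(trial, targets)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost
```

Calling `random_walk_optimize(initial_anchors, targets, (0.0, 10.0))` returns the adjusted anchor list and its mean HDOP. Because a move is accepted only when the cost decreases, the result is never worse than the starting layout, but it can stall in a local minimum; the evaluation and acceleration strategies described in the letter are not modelled here.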
IEEE Signal Processing Letters, vol. 31, pp. 2965-2969, published 2024-10-14.
Citations: 0
Color and Geometric Contrastive Learning Based Intra-Frame Supervision for Self-Supervised Monocular Depth Estimation
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-10-14 DOI: 10.1109/LSP.2024.3480032
Yanbo Gao;Xianye Wu;Shuai Li;Xun Cai;Chuankun Li
In recent years, self-supervised monocular depth estimation has become popular because it estimates depth without the need for ground-truth depth labels. Instead, it relies on inter-frame supervision: depth-based view synthesis reconstructs temporally adjacent frames, which indirectly supervises the generated depth. However, such supervision weakens depth estimation in temporally incoherent regions that contain small changes among consecutive frames. To overcome this problem, we propose a color and geometric contrastive learning based intra-frame supervision framework to enhance self-supervised monocular depth estimation. Color-contrastive learning is proposed to guide the network to learn color-invariant features, since color information is irrelevant to depth. To improve the local details of the learned features, pixel-level contrastive learning is further used to optimize the learning. Because depth estimation, as a pixel-level task, is sensitive to geometric transformations, geometric-contrastive learning is developed using an inverse geometric transformation to learn features that are equivariant to the geometric data augmentation. A local plane guidance (LPG) layer with contrastive learning is further used to decompose the geometric information and enhance the geometric contrastive learning. Experiments demonstrate that the proposed method achieves the best results among state-of-the-art methods in all tested quality metrics, with the largest improvement being an SqRel reduction of 22.8% over the Monodepth2 baseline and 3.2% over MonoViT.
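The color-contrastive idea in the abstract, pulling together features of the same content under different color augmentations, can be sketched with a generic InfoNCE-style loss. The abstract does not state which contrastive loss the authors use, so InfoNCE, the cosine similarity, the temperature value, and the function names below are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss over a batch.

    anchors[i] and positives[i] are assumed to be features of the same
    image (or pixel) under two color augmentations; every positives[j]
    with j != i acts as a negative for anchors[i].
    """
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)
```

The loss is small when each anchor is most similar to its own color-augmented counterpart and grows when the pairing is wrong, which is the behaviour a color-invariant feature extractor would be trained towards. Applying the same function to per-pixel feature vectors would give a pixel-level variant, again only as an assumed analogue of the method described above.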
IEEE Signal Processing Letters, vol. 31, pp. 2940-2944, published 2024-10-14.
Citations: 0