IEEE Signal Processing Letters: Latest Articles

A New Family of Graph Representation Matrices: Application to Graph and Signal Classification

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-14 DOI: 10.1109/LSP.2024.3479918

T. Averty;A. O. Boudraa;D. Daré-Emzivat

The most natural matrices that encode information about a graph are the adjacency and Laplacian matrices. These algebraic representations underpin the fundamental concepts and tools of graph signal processing, even though they reveal information in different ways. Furthermore, in spectral graph classification the problem of cospectrality may arise, and these matrices do not handle it well. Thus, the question of finding the best graph representation matrix still stands. In this letter, a new family of representations is introduced that captures information about graphs well and allows the standard representation matrices to be recovered. This unified family extends recent works in the literature. Two properties are proven, namely the positive semidefiniteness of its members and the monotonicity of their eigenvalues. Reported experimental results on spectral graph classification highlight the potential and added value of this new family of matrices, and provide evidence that the best representation depends on the structure of the underlying graph.
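The cospectrality problem mentioned in the abstract can be illustrated with the two standard matrices alone. The following is a minimal sketch, assuming NumPy and a textbook pair of graphs (the star K_{1,4} and a 4-cycle plus an isolated node); the helper names are hypothetical, and the letter's new family of matrices is not reproduced here because its definition is not given in this listing.

```python
import numpy as np

def adjacency_from_edges(n, edges):
    """Build the symmetric adjacency matrix of an undirected simple graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

def laplacian(A):
    """Combinatorial Laplacian L = D - A, with D the diagonal degree matrix."""
    return np.diag(A.sum(axis=1)) - A

def spectrum(M):
    """Eigenvalues of a symmetric matrix, sorted in ascending order."""
    return np.linalg.eigvalsh(M)

# Star K_{1,4}: node 0 joined to nodes 1..4.
A_star = adjacency_from_edges(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
# 4-cycle on nodes 0..3 plus the isolated node 4.
A_cycle = adjacency_from_edges(5, [(0, 1), (1, 2), (2, 3), (3, 0)])

# The two non-isomorphic graphs share an adjacency spectrum ...
same_adjacency_spectrum = np.allclose(spectrum(A_star), spectrum(A_cycle))
# ... but their Laplacian spectra differ.
same_laplacian_spectrum = np.allclose(
    spectrum(laplacian(A_star)), spectrum(laplacian(A_cycle))
)
# The Laplacian is positive semidefinite, so its smallest eigenvalue is (numerically) zero.
min_laplacian_eig = spectrum(laplacian(A_star)).min()
```

For this pair the adjacency matrix cannot separate the graphs while the Laplacian can. That is one small instance of why the choice of representation matrix matters for spectral classification.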

Citations: 0

Approximation of Non-Ideal Filtering in OTFS via Variable Fractional Delay

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-14 DOI: 10.1109/LSP.2024.3479934

Penghui Lai;Yaru Shan;Fanggang Wang;Shilian Wang;Peiguo Liu

Orthogonal time frequency space (OTFS) modulation is resilient to the Doppler effect and is thus employed in high-mobility communications. In earlier work, equivalent representations of the OTFS transmission were established, revealing the profound impact of fractional delay on channel sparsity. However, these representations tend to be distorted and cumbersome when digital reception and non-ideal filtering are taken into account. In this letter, we propose an alternative low-distortion representation that uses a variable-fractional-delay filter to characterize the fractional delays in a conventional digital transceiver. With the proposed method, the approximation of the received signal is improved and the error performance is enhanced compared with the original approaches. Finally, simulations show that the proposed representation is valid.
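To make the notion of a fractional-delay filter concrete, here is a minimal sketch using Lagrange interpolation, a common textbook design. It is an assumption chosen for illustration and not necessarily the filter used in the letter; the function name and the default order are hypothetical.

```python
import numpy as np

def lagrange_fd_taps(delay, order=3):
    """Taps h[0..order] of a Lagrange-interpolation FIR approximating a delay of `delay` samples."""
    n = np.arange(order + 1)
    h = np.ones(order + 1)
    for k in range(order + 1):
        mask = n != k
        # h[n] = prod over k != n of (delay - k) / (n - k)
        h[mask] *= (delay - k) / (n[mask] - k)
    return h

# The delay is a free parameter: new taps are computed for each requested value,
# which is what makes the filter "variable".
x = np.arange(20, dtype=float)        # a linear ramp as a test signal
h = lagrange_fd_taps(1.4)
y = np.convolve(x, h)[: len(x)]       # delayed signal (first `order` samples are transient)
```

Because Lagrange interpolation is exact for polynomials up to the filter order, the ramp is delayed by exactly 1.4 samples once the transient has passed. Real band-limited signals are only approximated, with accuracy that degrades toward high frequencies.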

Citations: 0

EMDSQA: A Neural Speech Quality Assessment Model With Speaker Embedding

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-10 DOI: 10.1109/LSP.2024.3478211

Yiya Hao;Feifei Xiong;Bei Li;Nai Ding;Jinwei Feng

We present a neural speech quality assessment model with speaker embedding. This model, i.e., EMDSQA, can precisely predict the Mean Opinion Score (MOS) of speech quality during online communications. Intrusive speech quality assessment methods such as perceptual objective listening quality analysis (POLQA) are not practical for online communications because every piece of degraded speech requires a corresponding clean reference. Non-intrusive methods can assess the quality of online speech, but have not reached the accuracy and robustness required for real-world applications. EMDSQA extracts the speaker embedding using an independent pipeline and feeds it as a prior feature to a self-attention-based MOS prediction model. Since EMDSQA does not need the corresponding clean reference, it is practical for real-world communication applications. An open-source test corpus, featuring real-world data, was also developed. Experimental results show that EMDSQA achieves a 0.92 Pearson correlation coefficient with the MOS measured from humans, surpassing other state-of-the-art intrusive or non-intrusive methods.
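The headline figure in the abstract is a Pearson correlation between predicted and human-rated MOS. As a minimal sketch of how such a figure is computed, assuming NumPy: the scores below are made up for illustration only and are not data from the letter.

```python
import numpy as np

def pearson_corr(pred, target):
    """Pearson correlation coefficient between two equal-length score lists."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    pc = pred - pred.mean()
    tc = target - target.mean()
    return float((pc @ tc) / np.sqrt((pc @ pc) * (tc @ tc)))

# Hypothetical scores on the 1-5 MOS scale (illustrative values only).
human_mos = [1.5, 2.0, 3.1, 3.8, 4.6]
predicted_mos = [1.7, 2.4, 2.9, 4.0, 4.4]
r = pearson_corr(predicted_mos, human_mos)
```

A value near 1 means the predictions rise and fall with the human ratings. Pearson correlation is insensitive to a constant offset or scale, so it is often reported alongside an error measure.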

IEEE Signal Processing Letters, vol. 31, pp. 3064-3068. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10713506

Citations: 0

Adversarial Embedding Steganography via Progressive Probability Optimizing and Discarded Stego Recycling

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-10 DOI: 10.1109/LSP.2024.3478109

Fan Wang;Zhangjie Fu;Xiang Zhang;Junjie Lu

Adversarial embedding for image steganography is a novel technique that effectively enhances the security of traditional steganographic algorithms. However, existing schemes leave room for improvement in the design of the optimization strategy and in the post-processing applied when optimization fails. In this paper, we design a progressive probability optimizing (PPO) strategy, which dynamically selects more efficient gradients to guide the probability optimization in a progressive manner. Moreover, we propose a discarded stego recycling (DSR) mechanism that, after the optimization fails, re-selects a stego from the set of discarded stegos that failed to deceive the target steganalyzer. In this way, the statistical distribution of the stego can still move closer to that of the cover, further improving steganographic security against re-trained steganalyzers in the adversary-aware scenario. Comprehensive experiments show that, compared with existing advanced schemes, the proposed method improves security against both re-trained hand-crafted feature-based and deep learning-based steganalysis models.

IEEE Signal Processing Letters, vol. 31, pp. 2920-2924.

Citations: 0

Efficient Vibrotactile Codec Based on Nbeats Network

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-09 DOI: 10.1109/LSP.2024.3477251

Yiwen Xu;Dongfang Chen;Ying Fang;Yang Lu;Tiesong Zhao

Within the domain of multimodal communication, the compression of audio, image, and video information is well-established, but compressing haptic signals, including vibrotactile signals, remains challenging. Particularly with the enhancement of haptic signal sampling rate and degrees of freedom, there is a substantial increase in data volume. While existing algorithms have made progress in vibrotactile codecs, there remains significant room for improvement in compression ratios. We propose an innovative Nbeats Network-based Vibrotactile Codec (NNVC) that leverages the statistical characteristics of vibrotactile data. This advanced codec integrates the Nbeats network for precise vibrotactile prediction, residual quantization, efficient Run-Length Encoding, and Huffman coding. The algorithm not only captures the intricate details of vibrotactile signals but also ensures high-efficiency data compression. It exhibits robust overall performance in terms of Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR), significantly surpassing the state-of-the-art.
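The entropy-coding back end named in the abstract (run-length encoding followed by Huffman coding of quantized residuals) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the residual values are hypothetical, the function names are invented for this example, and the Nbeats predictor and the letter's actual bitstream format are not modelled.

```python
import heapq
from collections import Counter
from itertools import count

def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]

def huffman_code(symbols):
    """Return a prefix-free {symbol: bitstring} table built from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so the heap never compares the dict payloads
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]

def huffman_decode(bits, table):
    """Decode a bitstring by matching prefix-free codewords greedily."""
    inverse = {b: s for s, b in table.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out

# Hypothetical quantized prediction residuals with runs of zeros.
residuals = [0, 0, 0, 0, 1, 0, 0, 0, -1, -1, 0, 0, 0, 0, 0, 2]
runs = rle_encode(residuals)
table = huffman_code(runs)
bitstream = "".join(table[r] for r in runs)
decoded = rle_decode(huffman_decode(bitstream, table))
```

The round trip is lossless by construction; in a codec of this kind the only lossy step would be the quantization of the residual before this stage.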

Citations: 0

Adaptive Pattern-Coupled Sparse Bayesian Learning for Channel Estimation in OTFS Systems

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-09 DOI: 10.1109/LSP.2024.3477254

Zhuo Chen;Xiaoming Niu;Jian Ding;Hong Wu;Zhiyang Liu

Orthogonal time frequency space (OTFS) modulation has emerged as a promising waveform for high-mobility wireless communications owing to its robustness against Doppler effects. However, because the frame duration is limited, fractional Doppler shifts appear, which makes channel estimation in OTFS systems challenging. In this letter, we formulate channel estimation as a block-sparse signal recovery problem and propose an adaptive pattern-coupled sparse Bayesian learning (APCSBL) method. Specifically, we introduce a pattern-coupled hierarchical Gaussian prior model to characterize the dependencies among adjacent channel coefficients. On this basis, an adaptive hyperparameter strategy is presented, in which several coupling parameters are used to characterize the strength of the correlation between adjacent elements. We then use the expectation maximization (EM) algorithm to update the hidden variables and the channel vector. Simulation results demonstrate that the proposed algorithm outperforms existing methods and works in various environments.
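The idea of a pattern-coupled prior can be sketched as follows. In the usual formulation from the pattern-coupled sparse Bayesian learning literature, each coefficient has a zero-mean Gaussian prior whose precision mixes its own hyperparameter with those of its two neighbours through a coupling weight. The sketch below assumes that formulation with a single shared weight `beta`; the letter's adaptive scheme uses several coupling parameters whose exact form is not given in this listing, so this is an illustration of the baseline idea only.

```python
import numpy as np

def coupled_precision(alpha, beta):
    """Prior precision of each coefficient: alpha[i] + beta * (alpha[i-1] + alpha[i+1])."""
    alpha = np.asarray(alpha, dtype=float)
    padded = np.pad(alpha, 1)  # zero-pad so the end coefficients have one neighbour each
    return alpha + beta * (padded[:-2] + padded[2:])

# A large hyperparameter (strong shrinkage toward zero) on one coefficient also
# raises the precision of its neighbours, which encourages block-sparse patterns.
alpha = np.array([1.0, 1.0, 100.0, 1.0, 1.0])
prior_variance = 1.0 / coupled_precision(alpha, beta=0.5)
```

With `beta = 0` the coupling disappears and the prior reduces to that of conventional sparse Bayesian learning, where each coefficient has an independent hyperparameter.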

IEEE Signal Processing Letters, vol. 31, pp. 2895-2899.

Citations: 0

Edge-Aware Attention Transformer for Image Super-Resolution

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-09 DOI: 10.1109/LSP.2024.3477298

Haoqian Wang;Zhongyang Xing;Zhongjie Xu;Xiangai Cheng;Teng Li

In this study, we explore poor edge reconstruction in image super-resolution (SR) tasks, emphasizing the significance of enhancing edge details identified through visual analysis. Existing SR networks typically optimize their architectures to enable complete feature extraction from feature maps, because the management of spatial and channel information during SR is often pivotal to the network's feature extraction capacity. Despite continuous improvements, directly comparing SR and high-resolution (HR) images through differential mapping reveals the suboptimal performance of these methods in edge reconstruction. In this paper, we introduce an edge-aware attention transformer (EAT), which focuses on edge reconstruction while preserving effective retrieval of the original low-frequency information. Our framework uses deformable convolution (DC) to adaptively extract edge features; feature enhancement techniques are then employed to intensify edge-sensitive features. Extensive experiments demonstrate that EAT achieves strong quantitative and visual results that surpass most benchmarks, validating its effectiveness against state-of-the-art models. The code is available at https://github.com/ImWangHaoqian/EAT.

IEEE Signal Processing Letters, vol. 31, pp. 2905-2909.

Citations: 0

Integrated Sensing and Communications Waveform Design for OTFS and FTN Fusion

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-09 DOI: 10.1109/LSP.2024.3478112

Xiaolong Yang;Bingrui Zhang;Mu Zhou;Ming Gao

In this letter, we propose an Integrated Sensing and Communications (ISAC) waveform design method based on the fusion of Orthogonal Time Frequency Space (OTFS) and Faster-Than-Nyquist (FTN) signaling. The objective is to maximize the communication data rate while minimizing the impact on the sensing performance of target parameter estimation. We first map the FTN symbols to the time domain of the OTFS waveform to compress the symbol spacing, and transmit them over time-varying channels. Then, an equalizer based on the Minimum Mean Square Error (MMSE) algorithm is used to eliminate the interference generated by FTN. Simulation results show that, taking the system bit-error rate into account, the proposed method increases throughput and improves the distance and velocity estimation of the target compared with existing methods.
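The MMSE equalization step can be sketched in its generic linear form, x_hat = (H^H H + sigma^2 I)^(-1) H^H y, for unit-power symbols. The sketch below assumes NumPy; the small tridiagonal matrix is a hypothetical stand-in for FTN-induced inter-symbol interference and is not the effective OTFS-FTN channel from the letter.

```python
import numpy as np

def mmse_equalize(H, y, noise_var):
    """Linear MMSE estimate of unit-power symbols x from y = H x + noise."""
    n = H.shape[1]
    gram = H.conj().T @ H + noise_var * np.eye(n)
    return np.linalg.solve(gram, H.conj().T @ y)

rng = np.random.default_rng(0)
# Hypothetical 8x8 channel: each symbol leaks into its two neighbours.
H = np.eye(8) + 0.3 * np.eye(8, k=1) + 0.3 * np.eye(8, k=-1)
x = rng.choice([-1.0, 1.0], size=8)            # BPSK symbols
y = H @ x + 0.01 * rng.standard_normal(8)      # received samples with mild noise
x_hat = np.sign(mmse_equalize(H, y, noise_var=1e-4))  # hard decisions
```

The regularization term `noise_var * I` is what distinguishes MMSE from zero-forcing: it limits noise amplification when the interference makes `H` poorly conditioned.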

Citations: 0

Radar-Based Crowd Counting in Real-World Environments With Spatiotemporal Transformer

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-09 DOI: 10.1109/LSP.2024.3477263

Jae-Ho Choi;Kyung-Tae Kim

With the advent of deep learning (DL) for signal processing, applying DL to radar-based crowd counting has yielded significant performance gains. Despite these advancements, current methods are predominantly validated under controlled conditions with limited variability in subject movement, which poses a challenge for practical use. Addressing this gap, this letter makes a first attempt at radar-based crowd counting in an unregulated and dense setting, capturing the radar reflections of up to 31 subjects in real-world scenarios such as queues at restaurant kiosks. Furthermore, to address the complexities of such challenging conditions, we introduce a novel radar crowd counting model that uses a spatiotemporal transformer. The experimental results demonstrate the potential of the proposed model as a robust crowd counting system in fully realistic scenarios and establish its superiority over conventional radar-based crowd counting models.

Citations: 0

Prototypical Metric Segment Anything Model for Data-Free Few-Shot Semantic Segmentation

IF 3.2 | Zone 2, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters

Pub Date : 2024-10-08 DOI: 10.1109/LSP.2024.3476208

Zhiyu Jiang;Ye Yuan;Yuan Yuan

Few-shot semantic segmentation (FSS) is crucial for image interpretation, yet it is constrained by requirements for extensive base data and a narrow focus on foreground-background differentiation. This work introduces Data-free Few-shot Semantic Segmentation (DFSS), a task that requires limited labeled images and forgoes the need for extensive base data, allowing for comprehensive image segmentation. The proposed method utilizes the Segment Anything Model (SAM) for its generalization capabilities. The Prototypical Metric Segment Anything Model is introduced, featuring an initial segmentation phase followed by prototype matching, effectively addressing the learning challenges posed by limited data. To enhance discrimination in multi-class segmentation, the Supervised Prototypical Contrastive Loss (SPCL) is designed to refine prototype features, ensuring intra-class cohesion and inter-class separation. To further accommodate intra-class variability, the Adaptive Prototype Update (APU) strategy dynamically refines prototypes, adapting the model to class heterogeneity. The method's effectiveness is demonstrated through superior performance over existing techniques on the DFSS task, marking a significant advancement in UAV image segmentation.
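The prototype-matching stage described above can be sketched generically: average the feature vectors of the few labeled examples per class, then assign each query feature to the nearest prototype. The sketch below assumes NumPy, cosine similarity as the metric, and 2-D toy vectors standing in for per-region features; the function names are hypothetical, and the letter's SPCL loss and APU update are not modelled.

```python
import numpy as np

def class_prototypes(features, labels):
    """Mean feature vector per class, returned as {label: prototype}."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    return {c.item(): features[labels == c].mean(axis=0) for c in np.unique(labels)}

def match_to_prototypes(queries, prototypes):
    """Assign each query vector to the class whose prototype has the highest cosine similarity."""
    classes = list(prototypes)
    P = np.stack([prototypes[c] for c in classes])
    Q = np.asarray(queries, dtype=float)
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    return [classes[i] for i in (Q @ P.T).argmax(axis=1)]

# Hypothetical support features: two labeled examples for each of two classes.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_labels = [0, 0, 1, 1]
protos = class_prototypes(support, support_labels)
predicted = match_to_prototypes([[0.8, 0.2], [0.2, 0.7]], protos)
```

Because the prototypes are plain averages, no gradient step is needed to add a class, which suits a setting with only a few labeled images.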

IEEE Signal Processing Letters, vol. 31, pp. 2800-2804.

Citations: 0
