
IEEE Signal Processing Letters: Latest Articles

A Dual-Reward Guided 2D Mapping Generation Network for JPEG Reversible Data Hiding
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652378
Rui Yan;Yao Zhao;Shaowei Weng;Lifang Yu
Recently, researchers have shifted focus to reversible data hiding (RDH) schemes for JPEG images. Reinforcement learning (RL) offers a way for RDH to automatically acquire the optimal two-dimensional (2D) mapping for 2D histograms of non-zero quantized alternating current coefficients. However, using only the payload-distortion reward mechanism (PDRM) in RL cannot inject payload guidance into the 2D mapping generation process. To tackle this issue, we propose a payload supplementary reward mechanism (PSRM) and incorporate PDRM and PSRM into RL to construct DR-2DNet, a dual-reward guided 2D mapping generation network that takes additional payload guidance into account. DR-2DNet generates two candidate 2D mappings: one with low distortion, generated using PDRM alone, and the other with low distortion and high payload, obtained by jointly using PDRM and PSRM. Finally, according to the required payload, whichever of the two mappings yields the lower distortion is used for data embedding. To prioritize low-cost frequency bands for data embedding, a frequency selection strategy combining the smoothness and embedding performance of each frequency band is designed to evaluate the cost of each band, reducing image distortion and preserving the file size. Extensive experiments are conducted on the Kodak dataset and 100 images randomly chosen from the BOSSBase dataset, and the results demonstrate that the proposed method is superior to several related state-of-the-art RDH schemes for JPEG images.
IEEE Signal Processing Letters, vol. 33, pp. 526-530
Citations: 0
Covariance Tensor Decomposition for NLOS Direction Finding in RIS-Aided Bistatic MIMO Radar
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652124
Qian-Peng Xie;Xiao-Peng Li;Ji-Yuan Chen;Ming-Xing Fang
This letter investigates the problem of direction-of-departure (DOD) and direction-of-arrival (DOA) estimation for non-line-of-sight (NLOS) targets in bistatic multiple-input multiple-output (MIMO) radar systems assisted by an intelligent reflecting surface (IRS). To tackle this issue, we propose a covariance tensor subspace-based algorithm. First, the received data is modeled within a tensor framework to preserve their inherent multi-dimensional spatiotemporal structure. Then, a fourth-order covariance tensor is constructed by computing correlations along the temporal dimension. Using the higher-order singular value decomposition (HOSVD), the signal subspace matrix is derived from this covariance tensor. The receive steering matrix is accurately reconstructed by exploiting the property of the Khatri–Rao product for full-column-rank matrices. Based on the estimated signal subspace and the reconstructed steering matrix, DOD and DOA estimation is efficiently performed via the rotational invariance technique combined with a one-dimensional correlation-based method, which provides automatic parameter pairing. Simulation results validate the superiority and effectiveness of the proposed algorithm in estimating angles.
IEEE Signal Processing Letters, vol. 33, pp. 574-578
Citations: 0
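The rotational invariance step mentioned in the abstract above can be illustrated with a much simpler setting. The following is a minimal sketch of standard matrix ESPRIT for a single half-wavelength uniform linear array, not the covariance-tensor/HOSVD algorithm of the letter; the function name `esprit_ula`, the array size, and the test angles are all hypothetical choices made for illustration.

```python
import numpy as np

def esprit_ula(X, num_sources):
    """Estimate arrival angles (degrees) from snapshots X of shape (M, T).

    Assumes a half-wavelength uniform linear array, so the steering
    vector is a(theta)[m] = exp(1j * pi * m * sin(theta)).
    """
    R = X @ X.conj().T / X.shape[1]              # sample covariance
    _, eigvecs = np.linalg.eigh(R)               # eigenvalues in ascending order
    Es = eigvecs[:, -num_sources:]               # signal subspace
    # Rotational invariance: Es[1:] = Es[:-1] @ Psi (least-squares solve)
    Psi = np.linalg.pinv(Es[:-1]) @ Es[1:]
    phases = np.angle(np.linalg.eigvals(Psi))    # equals pi * sin(theta)
    return np.sort(np.degrees(np.arcsin(phases / np.pi)))

# Noiseless demo with two hypothetical sources
rng = np.random.default_rng(0)
M, T = 8, 200
true_angles = np.array([-20.0, 30.0])
A = np.exp(1j * np.pi * np.outer(np.arange(M), np.sin(np.radians(true_angles))))
S = rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T))
print(esprit_ula(A @ S, num_sources=2))
```

In the bistatic MIMO case the same shift-invariance idea is applied to a subspace obtained from the tensor decomposition, with a separate pairing step for DOD and DOA; that part is outside this sketch.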
Modeling Localized PPG for Blood Pressure Forecasting With MoE and Quantile Regression
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652975
Yunjia Zhang;Zhixiong Hu;Mei Li
Accurate blood pressure (BP) monitoring using photoplethysmography (PPG) remains challenging due to noise, individual variability, and nonlinear signal dynamics. In this letter, we present FBPP-Net, a lightweight temporal-mixing framework that integrates sparse and shared Mixture-of-Experts (MoE) modules with quantile regression. The model enables specialized subnetworks to capture diverse temporal dependencies while providing robust probabilistic estimation of systolic and diastolic BP. Without requiring GPU acceleration, FBPP-Net achieves efficient training and inference, making it suitable for real-time wearable applications. Experiments on the UQVS dataset show that FBPP-Net-MoE attains SBP/DBP errors of 2.83/3.54 mmHg, and FBPP-Net-TM achieves 3.08/3.18 mmHg, outperforming XGBoost, LSTM, MLP, Informer, and TSMixer baselines. Furthermore, the analysis of expert activations and temporal segments provides interpretable insights into the localized dynamics driving near-future BP variations, supporting intelligent and explainable physiological monitoring.
IEEE Signal Processing Letters, vol. 33, pp. 531-535
Citations: 0
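Quantile regression, which the FBPP-Net abstract above relies on for probabilistic BP estimates, is usually trained with the pinball loss. The sketch below shows that loss only; it is not the letter's model, and the blood pressure values are made-up numbers used purely to exercise the function.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by q, over-prediction by (1 - q)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

# Hypothetical systolic readings (mmHg) against a constant prediction
y = np.array([118.0, 122.0, 130.0, 125.0])
pred = np.full_like(y, 124.0)
for q in (0.1, 0.5, 0.9):
    print(q, pinball_loss(y, pred, q))
```

At q = 0.5 the loss is half the mean absolute error, so the median predictor is recovered as a special case; fitting several q values yields a prediction interval.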
Customized Dynamic Filter Augmentation
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652120
Song-Kyoo Kim
Customized dynamic filter augmentation (CDFA) presents a novel data augmentation technique for time-series forecasting, adapting convolutional principles from signal processing to emphasize historical patterns through localized correlations and amplitude adjustments. Built upon convolutional filters, local correlations between paired random variables, and statistical forecasting functions from compact data learning, CDFA generates plausible subsequences while preserving original data characteristics. Empirical evaluations on real-world datasets, including stock prices for Apple, Google, AMD, and oil, demonstrate superior root mean square error (RMSE) reductions, with CDFA achieving 81% to 82% improvements over baselines like statistical forecasting from CDL and customized convolutional filters. This approach enhances model efficiency for large-scale sequences, outperforming traditional linear models in capturing shared patterns across diverse applications.
IEEE Signal Processing Letters, vol. 33, pp. 639-642
Citations: 0
Semantic-Spatial Guided Reasoning for Human-Object Interaction Detection
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652953
Ping Cao;Chunjie Zhang;Xiaolong Zheng;Yao Zhao
Human-Object Interaction (HOI) detection requires not only recognizing what the interaction is but also understanding where it occurs. Although recent methods have achieved remarkable progress, they often lack effective joint modeling of spatial and semantic information, which is essential for accurate reasoning in complex scenes. In this paper, we propose a Semantic-Spatial Guided Reasoning (SSGR) framework that performs interaction reasoning by jointly modeling global semantic cues and fine-grained spatial priors. Specifically, SSGR constructs pair-specific spatial layouts to encode detailed spatial relationships and introduces a global semantic decoder to learn category-aware semantic representations. A semantic-spatial guided reasoning module further adaptively fuses these complementary cues, enabling unified reasoning and more discriminative interaction understanding. Extensive experiments on HICO-DET and V-COCO demonstrate that SSGR consistently outperforms prior methods under both standard and zero-shot settings, validating the effectiveness of our semantic-spatial reasoning paradigm.
IEEE Signal Processing Letters, vol. 33, pp. 551-555
Citations: 0
Low-Complexity Approximations of the WECS Method for SAR Change Detection
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3653395
Luan Portella;R. J. Cintra;Aluísio Pinheiro
This letter introduces low-complexity approximations for the Wavelet Energy Correlation Screening (WECS) method, which aims at change detection in multitemporal SAR images. The WECS method relies on the non-decimated discrete wavelet transform (ND-DWT) to compute approximation coefficients employed in a feature screening process based on the Pearson correlation. Although effective, WECS presents a high computational cost due to its repeated wavelet filtering stage. To overcome this drawback, we propose two approximations for the wavelet filter coefficients, obtained by truncating their canonical signed digit (CSD) representation, which significantly reduces the number of arithmetic operations. Numerical experiments using both simulated and real-world datasets demonstrate that the proposed methods not only maintain the performance of the original WECS but also achieve computational gains, even outperforming it in certain scenarios.
IEEE Signal Processing Letters, vol. 33, pp. 643-647
Citations: 0
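The idea of truncating a canonical signed digit (CSD) representation, described in the WECS abstract above, can be sketched as follows. This is an illustration under stated assumptions: the 12-bit fixed-point width, the number of retained terms, and the use of the Haar coefficient 1/sqrt(2) as the example are hypothetical choices, and the letter's actual filters and truncation levels are not given in the abstract.

```python
def csd_digits(n):
    """Non-adjacent (canonical signed digit) form of integer n.

    Returns a list of (power, digit) pairs with digit in {-1, +1},
    ordered from least to most significant.
    """
    digits = []
    power = 0
    while n != 0:
        if n & 1:
            d = 2 - (n % 4)          # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            digits.append((power, d))
            n -= d
        n //= 2
        power += 1
    return digits

def truncate_csd(coeff, frac_bits=12, max_terms=2):
    """Approximate coeff by its max_terms most significant CSD digits."""
    n = round(coeff * (1 << frac_bits))               # fixed-point integer
    kept = sorted(csd_digits(n), reverse=True)[:max_terms]
    approx = sum(d * 2.0 ** (p - frac_bits) for p, d in kept)
    return approx, kept

# Hypothetical example: the Haar filter coefficient 1/sqrt(2)
for terms in (2, 3, 4):
    print(terms, truncate_csd(2 ** -0.5, max_terms=terms))
```

Each retained digit corresponds to one shift-and-add (or subtract) in hardware, so fewer digits means fewer arithmetic operations at the cost of a coarser coefficient.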
Boosting Small Object Detection via High-Frequency Feature Oriented Network
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652955
Min Li;Zhaofei Hao;Gang Li;Jin Wan;Delong Han;Mingle Zhou
Small Object Detection (SOD) aims to accurately identify and locate small objects in images. However, existing methods usually focus on exploring spatial domain features, neglecting high-frequency features that preserve fine-grained details such as texture and edge information. To overcome this limitation, we propose a High-Frequency Feature-Oriented Network (HFFO-Net). First, we introduce the Channel-wise Frequency Modulation Module (CFMM), which leverages the 2D Discrete Cosine Transform (DCT) to accentuate salient frequency components while mitigating noise interference. Second, we design a High-Frequency Oriented Module (HFOM), which utilizes the Channel Selection Branch (CSB) and Spatial Selection Branch (SSB) to highlight small objects in the channel and spatial region. Third, we introduce a Dual-Query Attention Fusion Mechanism (DQAFM), which reduces the semantic gap between spatial and frequency features and achieves better feature fusion through bidirectional cross-attention. Extensive experiments are implemented, and the corresponding results demonstrate that HFFO-Net excels at detecting small objects.
IEEE Signal Processing Letters, vol. 33, pp. 584-588
Citations: 0
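To make the role of the 2D DCT in the HFFO-Net abstract above more concrete, here is a minimal sketch that boosts high-frequency DCT coefficients of a single feature map. It is not the CFMM module itself, which the abstract describes only at a high level and which is presumably learned; the fixed `cutoff` and `gain` parameters and the function name are assumptions made for illustration.

```python
import numpy as np
from scipy.fft import dctn, idctn

def emphasize_high_freq(feature, cutoff=4, gain=2.0):
    """Scale DCT coefficients whose index sum u + v is at least cutoff."""
    coeffs = dctn(feature, norm="ortho")           # 2D DCT-II, orthonormal
    u, v = np.indices(coeffs.shape)
    high = (u + v) >= cutoff                       # crude high-frequency mask
    coeffs = np.where(high, gain * coeffs, coeffs)
    return idctn(coeffs, norm="ortho")             # back to the spatial domain

# Hypothetical 8x8 feature map containing a sharp vertical edge
fmap = np.zeros((8, 8))
fmap[:, 4:] = 1.0
print(np.round(emphasize_high_freq(fmap), 2))
```

A constant map has only a DC coefficient, so it passes through unchanged, while edges and texture are amplified.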
HIER: Heterogeneous Information Bottleneck and Expert Routing for Social Bot Detection
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652127
Kun Lu;Hongli Zhang;Yuchen Yang;Chao Meng;Binxing Fang
Social bots constitute a substantial fraction of active accounts on digital platforms, fundamentally threatening information authenticity and democratic discourse. Contemporary detection methods confront critical limitations: information imbalance across heterogeneous relations, computational challenges in processing massive neighborhoods, and inadequate multi-scale representation learning. We propose HIER (Heterogeneous Information Bottleneck and Expert Routing), a pioneering framework that integrates variational information theory with mixture-of-experts paradigms for social network analysis. HIER introduces relation-aware variational information bottleneck for optimal compression across relationship types, dynamic sparse expert routing that extends mixture-of-experts to edge-level graph processing, and dual-scale mutual information maximization enhancing representation discriminability through neighborhood consistency and graph-level contrastive learning. Experimental validation demonstrates HIER’s superior performance across real-world datasets, establishing new benchmarks for heterogeneous social bot detection.
IEEE Signal Processing Letters, vol. 33, pp. 521-525
Citations: 0
Risk-Aware Low-Pass Importance Sampling for Graph Signals Under Heterogeneous Noise and Model Mismatch
IF 3.9 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/LSP.2026.3652954
Kai-Wei Peng
We study sampling of smooth/bandlimited graph signals when (i) sensor noise is heterogeneous across vertices and (ii) the graph used to design the sampler can be mildly mismatched to the true topology. We propose a risk-aware variant of local low-pass importance sampling that scores each vertex via a Hutchinson estimator of the diagonal of a graph heat-kernel operator and reweights the score by the inverse noise variance. The sampler selects without replacement according to these risk-aware scores. Reconstruction is performed with standard decoders (Tikhonov, Bandlimited, and a Chebyshev data-consistent smoother), enabling fair comparisons to prior work. On grid, Erdős–Rényi (ER), and Barabási–Albert (BA) graphs, our approach consistently reduces the normalized root-mean-square error (NRMSE) compared to random sampling; the gain increases with the sampling rate and persists under selection-graph mismatch. The method is simple, eigendecomposition-free, and scales linearly in the number of edges per Hutchinson probe.
{"title":"Risk-Aware Low-Pass Importance Sampling for Graph Signals Under Heterogeneous Noise and Model Mismatch","authors":"Kai-Wei Peng","doi":"10.1109/LSP.2026.3652954","DOIUrl":"https://doi.org/10.1109/LSP.2026.3652954","url":null,"abstract":"We study sampling of smooth/bandlimited graph signals when (i) sensor noise is heterogeneous across vertices and (ii) the graph used to design the sampler can be mildly mismatched to the true topology.We propose a risk-aware variant of local low-pass importance sampling that scores each vertex via a Hutchinson estimator of the diagonal of a graph heat-kernel operator and reweights the score by the inverse noise variance. The sampler selects without replacement according to these risk-aware scores. Reconstruction is performed with standard decoders (<sc>Tikhonov</small>, <sc>Bandlimited</small>, and a Chebyshev data-consistent smoother), enabling fair comparisons to prior work. On grid, Erdős–Rényi (ER), and Barabási–Albert (BA) graphs, our approach consistently reduces the normalized root-mean-square error (NRMSE) compared to random sampling; the gain increases with the sampling rate and persists under selection-graph mismatch. The method is simple, eigendecomposition-free, and scales linearly in the number of edges per Hutchinson probe.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"556-558"},"PeriodicalIF":3.9,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
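The sampler described in the abstract above (Hutchinson estimate of the heat-kernel diagonal, inverse-noise-variance reweighting, selection without replacement) can be sketched roughly as follows. This is an illustrative sketch, not the author's implementation: the function names, the truncated Taylor approximation of the heat kernel (the letter may well use a different polynomial approximation), the diffusion time `t`, and the Efraimidis-Spirakis key trick for weighted sampling without replacement are all assumptions made here for the example.

```python
import random


def laplacian_apply(adj, x):
    """Return L x for the combinatorial Laplacian L = D - A (unweighted graph).

    `adj` is a list of neighbor lists; cost is linear in the number of edges.
    """
    return [len(adj[v]) * x[v] - sum(x[u] for u in adj[v]) for v in range(len(adj))]


def heat_apply(adj, x, t=0.5, order=20):
    """Approximate exp(-t L) x without any eigendecomposition.

    Assumption: a truncated Taylor series, adequate only when t * lambda_max
    is small; a Chebyshev expansion would be the more robust choice.
    """
    term = list(x)
    out = list(x)
    for k in range(1, order + 1):
        lap = laplacian_apply(adj, term)
        term = [(-t / k) * a for a in lap]  # term = (-tL)^k x / k!
        out = [o + a for o, a in zip(out, term)]
    return out


def hutchinson_diag(adj, t=0.5, probes=200, rng=None):
    """Estimate diag(exp(-t L)) as the average of z * (H z) over Rademacher z."""
    rng = rng or random.Random(0)
    n = len(adj)
    acc = [0.0] * n
    for _ in range(probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        hz = heat_apply(adj, z, t)
        for v in range(n):
            acc[v] += z[v] * hz[v]
    return [a / probes for a in acc]


def risk_aware_scores(diag, noise_var):
    """Reweight each low-pass score by the inverse noise variance of its vertex."""
    return [max(d, 0.0) / s for d, s in zip(diag, noise_var)]


def sample_without_replacement(scores, k, rng=None):
    """Pick k distinct vertices with probability driven by `scores`.

    Assumption: Efraimidis-Spirakis keys u ** (1 / w); the letter's exact
    selection rule is not given in the abstract.
    """
    rng = rng or random.Random(0)
    keys = [(rng.random() ** (1.0 / w) if w > 0 else 0.0, v)
            for v, w in enumerate(scores)]
    keys.sort(reverse=True)
    return sorted(v for _, v in keys[:k])


if __name__ == "__main__":
    # Hypothetical 5-vertex path graph with one noisy sensor in the middle.
    path = [[1], [0, 2], [1, 3], [2, 4], [3]]
    sigma2 = [0.1, 0.1, 1.0, 0.1, 0.1]
    scores = risk_aware_scores(hutchinson_diag(path, probes=500), sigma2)
    print(sample_without_replacement(scores, k=2))
```

Each probe costs a fixed number of Laplacian applications, so the work per probe grows linearly with the number of edges, which matches the scaling claim in the abstract.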
Latent-Level Enhancement With Flow Matching for Robust Automatic Speech Recognition
IF 3.9, Zone 2 (Engineering & Technology), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC, Pub Date: 2026-01-12, DOI: 10.1109/LSP.2026.3653238
Da-Hee Yang;Joon-Hyuk Chang
Noise-robust automatic speech recognition (ASR) has been commonly addressed by applying speech enhancement (SE) at the waveform level before recognition. However, speech-level enhancement does not always translate into consistent recognition improvements due to residual distortions and mismatches with the latent space of the ASR encoder. In this letter, we introduce a complementary strategy termed latent-level enhancement, where distorted representations are refined during ASR inference. Specifically, we propose a plug-and-play Flow Matching Refinement module (FM-Refiner) that operates on the output latents of a pretrained CTC-based ASR encoder. Trained to map imperfect latents—either directly from noisy inputs or from enhanced-but-imperfect speech—toward their clean counterparts, the FM-Refiner is applied only at inference, without fine-tuning ASR parameters. Experiments show that FM-Refiner consistently reduces word error rate, both when directly applied to noisy inputs and when combined with conventional SE front-ends. These results demonstrate that latent-level refinement via flow matching provides a lightweight and effective complement to existing SE approaches for robust ASR.
Da-Hee Yang; Joon-Hyuk Chang, "Latent-Level Enhancement With Flow Matching for Robust Automatic Speech Recognition," IEEE Signal Processing Letters, vol. 33, pp. 589-593. Published 2026-01-12. DOI: https://doi.org/10.1109/LSP.2026.3653238
Citations: 0
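To make the idea of refining latents with flow matching concrete, here is a minimal sketch of the generic recipe: regress a velocity field along a path from the imperfect latent to the clean one, then integrate that field at inference. Everything specific is an assumption made for illustration: the straight-line path, the Euler solver, the step count, and all function names are hypothetical, and the oracle velocity function stands in for the trained FM-Refiner network, whose architecture and training details are not given in the abstract.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from imperfect latent x0 (t=0) to clean x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target for the straight path: constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def fm_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    xt = interpolate(x0, x1, t)
    pred = velocity_fn(xt, t)
    tgt = target_velocity(x0, x1)
    return sum((p - g) ** 2 for p, g in zip(pred, tgt)) / len(tgt)


def euler_refine(x0, velocity_fn, steps=8):
    """Inference: integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps.

    Only the latent is updated; the ASR encoder that produced x0 is untouched.
    """
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


if __name__ == "__main__":
    noisy = [0.0, 2.0, -1.0]   # hypothetical latent from a noisy utterance
    clean = [1.0, 1.0, 1.0]    # hypothetical latent from the clean utterance

    def oracle(x, t):
        # Stand-in for a trained network: returns the exact target velocity.
        return target_velocity(noisy, clean)

    print(fm_loss(oracle, noisy, clean, t=0.3))
    print(euler_refine(noisy, oracle))
```

In a real system the oracle would be replaced by a learned model conditioned on the current latent and time, and the clean latent would be available only during training.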